Abstract

Research in the past decade has established capacity theorems for point-to-point
bosonic channels with additive thermal noise, under the presumption of a
conjecture on the minimum output von Neumann entropy. In the first part of this
thesis, we evaluate the optimum capacity for free-space line-of-sight optical
communication using Gaussian-attenuation apertures. Optimal power allocation
across all the spatio-temporal modes is studied, in both the far-field and
near-field propagation regimes. We establish the gap between ultimate capacity
and data rates achievable using classical encoding states and structured
receivers. The remainder of the thesis addresses the ultimate capacity of
bosonic broadcast channels, i.e., when one transmitter is used to send
information to more than one receiver. We show that when coherent-state
encoding is employed in conjunction with coherent detection, the bosonic
broadcast channel is equivalent to the classical degraded Gaussian broadcast
channel whose capacity region is known. We draw upon recent work on the
capacity region of the two-user degraded quantum broadcast channel to establish
the ultimate capacity region for the bosonic broadcast channel, under the
presumption of another conjecture on the minimum output entropy. We also
generalize the degraded broadcast channel capacity theorem to more than two
receivers, and prove that if the above conjecture is true, then the rate region
achievable using a coherent-state encoding with optimal joint-detection
measurement at the receivers would be the ultimate capacity region of the
bosonic broadcast channel with loss and additive thermal noise. We show that
the minimum output entropy conjectures, restated for Wehrl entropy, are
immediate consequences of the entropy power inequality (EPI). We then show that
an EPI-like inequality for von Neumann entropy would imply all the minimum
output entropy conjectures needed for our channel capacity results. We call
this new conjectured result the Entropy Photon-Number Inequality (EPnI).

Thesis  Supervisor:  Jeffrey  H.  Shapiro 

Title:  Julius  A.  Stratton  Professor  of  Electrical  Engineering 
2-7  Comparison of capacities (in bits per channel use) of the single-mode
lossy bosonic channel achieved by: OOK modulation with direct detection;
$\{|\alpha\rangle, |-\alpha\rangle\}$-BPSK modulation using coherent states; and
homodyne and heterodyne detection with isotropic-Gaussian random coding
over coherent states. For very low values of $\bar{N}$, the average transmitter
photon number, shown in (a), OOK outperforms all but the ultimate
capacity. At somewhat higher values of $\bar{N}$, both OOK and BPSK are
better than isotropic-Gaussian random coding with coherent detection.
In the high-$\bar{N}$ regime, coherent-detection capacities outperform the
binary schemes, because the maximum rate achievable by the latter
approaches cannot exceed 1 bit per channel use.

2-8  This figure illustrates the gap between the ultimate BPSK coherent-state
capacity (Equation (2.31)) and the achievable rate using a BPSK
coherent-state alphabet and symbol-by-symbol “Dolinar receiver”
measurement (Equation (2.30)). In order to bridge the gap between these
two capacities, optimal multi-symbol joint measurement schemes must
be used at the receiver. All capacities are plotted in units of bits per
channel use.


3-1  Classical additive Gaussian noise broadcast channel.

3-2  Capacity region of the classical additive Gaussian noise broadcast
channel, with an input power constraint $E[|X_A|^2] \le 10$, and noise powers
given by $N_B = 2$ and $N_C = 6$. The rates $R_B$ and $R_C$ are in nats per
channel use.
3-3  A broadcast channel in which the transmitter Alice encodes information
into a real-valued $\alpha$ for a classical electromagnetic field (coherent
state $|\alpha\rangle$) and the beam splits into two, through a lossless beam splitter
with transmissivity $\eta$, in the presence of an ambient thermal environment
with an average of $N_T$ photons per mode. Bob and Charlie, the two
receivers, receive their respective classical signals $Y_B$ and $Y_C$ at the two
output ports of the beam splitter by performing optical homodyne
detection. In the limit of high noise ($N_T \gg 1$), and with the substitutions
$X_A = \alpha$, $\alpha \in \mathbb{R}$, and $N_T = 2N$, this channel reduces to the broadcast
channel model described by (3.18).


3-4  Schematic diagram of the degraded single-mode bosonic broadcast
channel. The transmitter Alice (A) encodes her messages to Bob (B) and
Charlie (C) in a classical index $j$, and, over $n$ successive uses of the
channel, creates a bipartite state $\hat{\rho}_j^{B^n C^n}$ at the receivers.
3-5  This figure summarizes the setup of the transmitter and the channel
model for the M-receiver quantum degraded broadcast channel. In
each successive $n$ uses of the channel, the transmitter A sends a randomly
generated classical message $(m_0, \ldots, m_{M-1}) \in (W_0, \ldots, W_{M-1})$
to the M receivers $Y_0, \ldots, Y_{M-1}$, where the message sets $W_k$ are sets
of classical indices of sizes $2^{nR_k}$, for $k \in \{0, \ldots, M-1\}$. The dashed
arrows indicate the direction of degradation, i.e., $Y_0$ is the least noisy
receiver, and $Y_{M-1}$ is the noisiest receiver. In this degraded channel
model, the quantum state received at the receiver $Y_k$, $\hat{\rho}^{Y_k}$, can always
be reconstructed from the quantum state received at the receiver $Y_{k'}$,
$\hat{\rho}^{Y_{k'}}$, for $k' < k$, by passing $\hat{\rho}^{Y_{k'}}$ through a trace-preserving
completely positive map (a quantum channel). For sending the classical
message $(m_0, \ldots, m_{M-1}) \equiv j$, Alice chooses an $n$-use state (codeword)
$\hat{\rho}_j^{A^n}$ using a prior distribution $p_{j|i_1}$, where $i_k$ denotes the complex
values taken by an auxiliary random variable $T_k$. It can be shown that,
in order to compute the capacity region of the quantum degraded
broadcast channel, we need to choose $M-1$ complex-valued auxiliary
random variables with a Markov structure as shown above, i.e.,
$T_{M-1} \to T_{M-2} \to \cdots \to T_k \to \cdots \to T_1 \to A^n$ is a Markov chain.
3-6  This figure illustrates the decoding end of the M-receiver quantum
degraded broadcast channel. The decoder consists of a set of measurement
operators, described by positive operator-valued measures
(POVMs) for each receiver: $\{\Lambda^{0}_{m_0 \ldots m_{M-1}}\}$, $\{\Lambda^{1}_{m_1 \ldots m_{M-1}}\}$,
$\ldots$, $\{\Lambda^{M-1}_{m_{M-1}}\}$ on $Y_0^n, Y_1^n, \ldots, Y_{M-1}^n$ respectively. Because of the
degraded nature of the channel, if the transmission rates are within the
capacity region and proper encoding and decoding are employed at the
transmitter and at the receivers respectively, $Y_0$ can decode the entire
message M-tuple to obtain estimates $(\hat{m}_0^0, \ldots, \hat{m}_{M-1}^0)$, $Y_1$ can decode
the reduced message $(M-1)$-tuple to obtain its own estimates
$(\hat{m}_1^1, \ldots, \hat{m}_{M-1}^1)$, and so on, until the noisiest receiver $Y_{M-1}$ can only
decode the single message index $m_{M-1}$ to obtain an estimate $\hat{m}_{M-1}^{M-1}$.
Even though the less noisy receivers can decode the messages of the noisier
receivers, the message $m_k$ is intended to be sent to receiver $Y_k$, $\forall k$.
Hence, when we say that a broadcast channel is operating at a rate
$(R_0, \ldots, R_{M-1})$, we mean that the message $m_k$ is reliably decoded by
receiver $Y_k$ at the rate $R_k$ bits per channel use.


3-7  A single-mode noiseless bosonic broadcast channel with two receivers,
$\mathcal{N}_{A\text{-}BC}$, can be envisioned as a beam splitter with transmissivity $\eta$.
With $\eta > 1/2$, the bosonic broadcast channel reduces to a degraded
quantum broadcast channel, where Bob (B) is the less-noisy receiver
and Charlie (C) is the more noisy (degraded) receiver.

3-8  The stochastically degraded version of the single-mode bosonic
broadcast channel.
3-9  Comparison of bosonic broadcast channel capacity regions, in bits per
channel use, achieved by coherent-state encoding using homodyne
detection (the capacity region lies inside the boundary marked by
circles), heterodyne detection (the capacity region lies inside the
boundary marked by dashes), and optimum reception (the capacity region
lies inside the boundary marked by the solid curve), for $\eta = 0.8$, and
$\bar{N} = 1$, 5, and 15.


3-10  A single-mode bosonic broadcast channel with two receivers,
$\mathcal{N}_{A\text{-}BC}$, with additive thermal noise. The transmitter Alice (A) is
constrained to use $\bar{N}$ photons per use of the channel, and the noise
(environment) mode is in a zero-mean thermal state $\hat{\rho}_{T,N}$, with mean
photon number $N$. With $\eta > 1/2$, the bosonic broadcast channel
reduces to a degraded quantum broadcast channel, where Bob (B) is
the less-noisy receiver and Charlie (C) is the more noisy (degraded)
receiver. See the degraded version of the channel in Fig. 3-11.

3-11  The stochastically degraded version of the single-mode bosonic
broadcast channel with additive thermal noise.
3-12  An M-receiver noiseless bosonic broadcast channel. Transmitter Alice
(A) sends independent messages to M receivers, $Y_0, \ldots, Y_{M-1}$. We
have labeled Alice’s modal annihilation operator as $\hat{a}$, and those of
the receivers $Y_l$ as $\hat{y}_l$, $\forall l \in \{0, \ldots, M-1\}$. In order to characterize
the bosonic broadcast channel as a quantum-mechanically correct
representation of the evolution of a closed system, we must incorporate
$M-1$ environment inputs $\{E_1, \ldots, E_{M-1}\}$ (whose modal annihilation
operators have been labeled as $\{\hat{e}_1, \ldots, \hat{e}_{M-1}\}$) along with the
transmitter A, such that the M output annihilation operators are related
to the M input annihilation operators through a unitary matrix,
as given in Eq. (3.93). For the noiseless bosonic broadcast channel, all
the $M-1$ environment modes $\hat{e}_k$ are in their vacuum states. The
transmitter is constrained to at most $\bar{N}$ photons on average per channel
use for encoding the data. The fractional power coupling from the
transmitter to the receiver $Y_k$ is taken to be $\eta_k$. We have labeled the
receivers in such a way that $1 > \eta_0 > \eta_1 > \ldots > \eta_{M-1} > 0$. This
ordering of the transmissivities renders this channel a degraded quantum
broadcast channel $A \to Y_0 \to \cdots \to Y_{M-1}$ (see Fig. 3-13). The
fractional power coupling from $E_k$ to $Y_l$ has been taken to be $\eta_{kl}$. For
$M = 2$, the above channel model reduces to the familiar two-receiver
beam splitter channel model as given in Fig. 3-7.
3-13  An equivalent stochastically degraded model for the M-receiver
noiseless bosonic broadcast channel depicted in Fig. 3-12. If the receivers
are ordered in a way such that the fractional power couplings $\eta_k$ from
the transmitter to the receiver $Y_k$ are in decreasing order, the quantum
states at each receiver $Y_k$, for $k \in \{1, \ldots, M-1\}$, can be obtained from
the state received at receiver $Y_{k-1}$ by mixing it with a vacuum state,
through a beam splitter of transmissivity $\eta_k/\eta_{k-1}$. This equivalent
representation of the M-receiver bosonic broadcast channel confirms that
the bosonic broadcast channel is indeed a degraded broadcast channel,
whose capacity region is given by the infinite-dimensional
(continuous-variable) extension of Yard et al.’s theorem in Eqs. (3.38).
3-14  In order to evaluate the capacity region of the M-receiver noiseless
bosonic degraded broadcast channel depicted in Fig. 3-13 using a
coherent-state input alphabet $\{|\alpha\rangle\}$, $\alpha \in \mathbb{C}$ and
$\langle \hat{a}^\dagger \hat{a} \rangle = \langle |\alpha|^2 \rangle \le \bar{N}$, we choose
the $M-1$ auxiliary classical Markov random variables (in Eqs. (3.35))
as complex-valued random variables $T_k$, $k \in \{1, \ldots, M-1\}$, taking
values $\tau_k \in \mathbb{C}$. In order to visualize the postulated optimal Gaussian
distributions for the random variables $T_k$, let us associate with $T_k$ a
quantum system, i.e., a coherent-state alphabet $\{|\tau_k\rangle\}$ and modal
annihilation operator $\hat{t}_k$, $\forall k$. In accordance with the Markov property of
the random variables $T_k$, let $\hat{t}_{M-1}$ be in an isotropic zero-mean Gaussian
mixture of coherent states with a variance $\bar{N}$ (see Eq. (3.104)),
and for $k \in \{1, \ldots, M-2\}$, let $\hat{t}_k$ be obtained from $\hat{t}_{k+1}$ by mixing
it with another mode $\hat{u}_{k+1}$ excited in a zero-mean thermal state with
mean photon number $\bar{N}$, through a beam splitter with transmissivity
$1 - \gamma_{k+1}$, as shown in the figure above, for some $\gamma_{k+1} \in (0,1)$. We
complete the Markov chain $T_{M-1} \to \cdots \to T_1 \to A$ by obtaining the
transmitter mode $\hat{a}$ by mixing $\hat{t}_1$ with a mode $\hat{u}_1$ excited in a zero-mean
thermal state with mean photon number $\bar{N}$, through a beam splitter
with transmissivity $1 - \gamma_1$, for $\gamma_1 \in (0,1)$. The above setup of the
auxiliary modes gives rise to the distributions given in Eqs. (3.104),
which we use to evaluate the achievable rate region of the M-receiver
bosonic broadcast channel using coherent-state encoding.




3-15  Comparison of bosonic broadcast and multiple-access channel capacity
regions for $\eta = 0.8$, and $\bar{N} = 15$. The rates are in units of bits
per channel use. The red line is the conjectured ultimate broadcast
capacity region, which lies below the green line, the envelope of the
MAC capacity regions. Assuming that the optimum modulation, coding,
and receivers are available, on a fixed beam splitter with the same
power budget, more collective classical information can be sent when
this beam splitter is used as a multiple-access channel, as opposed to
when it is used as a broadcast channel. This is unlike the case of
the classical MIMO Gaussian multiple-access and broadcast channels
(BC), where a duality holds between the MAC and BC capacity regions.


3-16  Schematic diagram of the single-mode bosonic wiretap channel. The
transmitter Alice (A) encodes her messages to Bob (B) in a classical
index $j$ over $n$ successive uses of the channel, thus preparing a
bipartite state $\hat{\rho}_j^{B^n E^n}$, where $E^n$ represents $n$ channel uses of an
eavesdropper Eve (E).


4-1  This figure presents empirical evidence in support of weak conjecture
2. The input $\hat{\rho}^A = |0\rangle\langle 0|$ is in its vacuum state. For a fixed value of
$S(\hat{\rho}^B)$, we choose three different inputs $\hat{\rho}^B$, each one diagonal in the
Fock-state basis, i.e., $\hat{\rho}^B = \sum_{n=0}^{\infty} p_n |n\rangle\langle n|$ with $\sum_{n=0}^{\infty} p_n = 1$. The
three different inputs $\hat{\rho}^B$ correspond to choosing the distribution $\{p_n\}$
to be a Binomial distribution (blue curve), a Poisson distribution (red
curve) and a Bose-Einstein distribution (green curve). As expected,
we see that the output state $\hat{\rho}^C$ has the lowest entropy when $\hat{\rho}^B$ is a
thermal state, i.e., when $\{p_n\}$ is a Bose-Einstein distribution.
A-1  Balanced homodyne detection. Homodyne detection is used to measure
one quadrature of the field. The signal field $\hat{a}$ is mixed on a 50-50 beam
splitter with a local oscillator excited in a strong coherent state with
phase $\theta$ that has the same frequency as the signal. The output beams
are incident on a pair of photodiodes whose photocurrent outputs are
passed through a differential amplifier and a matched filter to produce
the classical output $\alpha_\theta$. If the input $\hat{a}$ is in a coherent state $|\alpha\rangle$, then
the output of homodyne detection is predicted correctly by both the
semiclassical and the quantum theories, i.e., a Gaussian-distributed
real number $\alpha_\theta$ with mean $\alpha\cos\theta$ and variance 1/4. If the input state
is not a classical (coherent) state, then the quantum theory must be
used to correctly account for the statistics of the outcome, which is
given by the measurement of the quadrature operator $\Re(\hat{a}e^{-i\theta})$.


A-2  Balanced heterodyne detection. Heterodyne detection is used to measure
both quadratures of the field simultaneously. The signal field $\hat{a}$
is mixed on a 50-50 beam splitter with a local oscillator excited in a
strong coherent state with phase $\theta = 0$, whose frequency is offset by an
intermediate (radio) frequency, $\omega_{IF}$, from that of the signal. The output
beams are incident on a pair of photodiodes whose photocurrent
outputs are passed through a differential amplifier. The output current
of the differential amplifier is split into two paths and the two are
multiplied by a pair of strong orthogonal intermediate-frequency
oscillators followed by detection by a pair of matched filters, to yield two
classical outcomes $\alpha_1$ and $\alpha_2$. If the input is a coherent state $|\alpha\rangle$, then
both semiclassical and quantum theories predict the outputs $(\alpha_1, \alpha_2)$
to be a pair of real variance-1/2 Gaussian random variables with means
$(\Re(\alpha), \Im(\alpha))$. For a general input state $\hat{\rho}$, the outcome of heterodyne
measurement $(\alpha_1, \alpha_2)$ has a distribution given by the Husimi function
of $\hat{\rho}$ given by $Q_{\hat{\rho}}(\alpha) = \langle\alpha|\hat{\rho}|\alpha\rangle/\pi$.
B-1  This figure summarizes the setup of the transmitter and the channel
model for the M-receiver quantum degraded broadcast channel. In
each successive $n$ uses of the channel, the transmitter A sends a randomly
generated classical message $(m_0, \ldots, m_{M-1}) \in (W_0, \ldots, W_{M-1})$
to the M receivers $Y_0, \ldots, Y_{M-1}$, where the message sets $W_k$ are sets
of classical indices of sizes $2^{nR_k}$, for $k \in \{0, \ldots, M-1\}$. The dashed
arrows indicate the direction of degradation, i.e., $Y_0$ is the least noisy
receiver, and $Y_{M-1}$ is the noisiest receiver. In this degraded channel
model, the quantum state received at the receiver $Y_k$, $\hat{\rho}^{Y_k}$, can always
be reconstructed from the quantum state received at the receiver $Y_{k'}$,
$\hat{\rho}^{Y_{k'}}$, for $k' < k$, by passing $\hat{\rho}^{Y_{k'}}$ through a trace-preserving
completely positive map (a quantum channel). For sending the classical
message $(m_0, \ldots, m_{M-1}) \equiv j$, Alice chooses an $n$-use state (codeword)
$\hat{\rho}_j^{A^n}$ using a prior distribution $p_{j|i_1}$, where $i_k$ denotes the complex
values taken by an auxiliary random variable $T_k$. It can be shown that,
in order to compute the capacity region of the quantum degraded
broadcast channel, we need to choose $M-1$ complex-valued auxiliary
random variables with a Markov structure as shown above, i.e.,
$T_{M-1} \to T_{M-2} \to \cdots \to T_k \to \cdots \to T_1 \to A^n$ is a Markov chain.
B-2  This figure illustrates the decoding end of the M-receiver quantum
degraded broadcast channel. The decoder consists of a set of measurement
operators, described by positive operator-valued measures
(POVMs) for each receiver: $\{\Lambda^{0}_{m_0 \ldots m_{M-1}}\}$, $\{\Lambda^{1}_{m_1 \ldots m_{M-1}}\}$,
$\ldots$, $\{\Lambda^{M-1}_{m_{M-1}}\}$ on $Y_0^n, Y_1^n, \ldots, Y_{M-1}^n$ respectively. Because of the
degraded nature of the channel, if the transmission rates are within the
capacity region and proper encoding and decoding are employed at the
transmitter and at the receivers respectively, $Y_0$ can decode the entire
message M-tuple to obtain estimates $(\hat{m}_0^0, \ldots, \hat{m}_{M-1}^0)$, $Y_1$ can decode
the reduced message $(M-1)$-tuple to obtain its own estimates
$(\hat{m}_1^1, \ldots, \hat{m}_{M-1}^1)$, and so on, until the noisiest receiver $Y_{M-1}$ can only
decode the single message index $m_{M-1}$ to obtain an estimate $\hat{m}_{M-1}^{M-1}$.
Even though the less noisy receivers can decode the messages of the noisier
receivers, the message $m_k$ is intended to be sent to receiver $Y_k$, $\forall k$.
Hence, when we say that a broadcast channel is operating at a rate
$(R_0, \ldots, R_{M-1})$, we mean that the message $m_k$ is reliably decoded by
receiver $Y_k$ at the rate $R_k$ bits per channel use.
Chapter 1


Introduction 


The  objective  of  any  communication  system  is  to  transfer  information  from  one  point 
to  another  efficiently,  given  the  constraints  on  the  available  physical  resources.  In 
most  communication  systems,  the  transfer  of  information  is  done  by  superimposing 
the  information  onto  an  electromagnetic  (EM)  wave.  The  EM  wave  is  known  as  the 
carrier  and  the  process  of  superimposing  information  onto  the  carrier  wave  is  known 
as  modulation.  The  modulated  carrier  is  then  transmitted  to  the  destination  through 
a  noisy  medium,  called  the  communication  channel.  At  the  receiver,  the  noisy  wave 
is  received  and  demodulated  to  retrieve  the  information  as  accurately  as  possible. 
Such  systems  are  often  characterized  by  the  location  of  the  carrier  wave’s  frequency 
within  the  electromagnetic  spectrum.  In  radio  systems  for  example,  the  carrier  wave 
is  selected  from  the  radio  frequency  (RF)  portion  of  the  spectrum. 

In  an  optical  communication  system,  the  carrier  wave  is  selected  from  the  optical 
range  of  frequencies,  which  includes  the  infrared,  visible  light,  and  ultraviolet  frequen¬ 
cies.  The  main  advantage  of  communicating  with  optical  frequencies  is  the  potential 
increase  in  information  that  can  be  transmitted  because  of  the  possibility  of  har¬ 
nessing  an  immense  amount  of  bandwidth.  The  amount  of  information  transmitted 
in  any  communication  system  depends  directly  on  the  bandwidth  of  the  modulated 
carrier,  which  is  usually  a  fraction  of  the  carrier  wave’s  frequency.  Thus  increasing 
the  carrier  frequency  increases  the  available  transmission  bandwidth.  For  example, 
the frequencies in the optical range would typically have a usable transmission
bandwidth about three to four orders of magnitude greater than that of a carrier wave
in the RF region. Another important advantage of optical communications relative
to RF systems comes from their narrower transmitted beams: μrad beam divergences
are possible with optical systems. These narrower beamwidths deliver power
more  efficiently  to  the  receiver  aperture.  Narrow  beams  also  enhance  communication 
security  by  making  it  hard  for  an  eavesdropper  to  intercept  an  appreciable  amount  of 
the  transmitted  power.  Communicating  with  optical  frequencies  has  some  challenges 
associated with it as well. As optical frequencies are accompanied by extremely small
wavelengths, the design of optical components requires techniques completely different
from those used in conventional microwave or RF communication systems. Also, the advantage
that  optical  communication  derives  from  its  comparatively  narrow  beam  introduces 
the  need  for  high-accuracy  beam  pointing.  RF  beams  require  much  less  pointing 
accuracy. Progress in the theoretical study of optical communication, the advent of
the laser (a high-power optical carrier source), the developments in the field of optical
fiber-based communication, and the development of novel wideband optical
modulators and efficient detectors have made optical communication emerge as a field of
immense technological importance [1].

The field of information theory, which was born from Claude Shannon’s revolutionary
1948 paper [2], addresses ultimate limits on data compression and communication
rates over noisy communication channels. It tells us how to compute the maximum
rate at which reliable data communication can be achieved over a noisy
communication channel by appropriately encoding and decoding the data. This ultimate data
rate is known as the channel capacity [2, 3, 4]. Information theory also tells us how
to compute the maximum extent to which a given set of data can be compressed so that the
original data can be recovered within a specified tolerable distortion level.
Unfortunately,  information  theory  does  not  give  us  the  exact  algorithm  (or  the  op¬ 
timal  code)  that  would  achieve  capacity  on  a  given  channel,  nor  does  it  tell  us  how 
to  optimally  compress  a  given  set  of  data.  Nevertheless,  it  sets  ultimate  limits  on 
communication  and  data  compression  that  are  essential  to  meaningfully  determine 
how  well  a  real  system  is  actually  performing. 


The performance of communication systems that rely on electromagnetic wave
propagation is ultimately limited by noise of quantum-mechanical origin. Moreover,
high-sensitivity photodetection systems have long been close to this noise limit.
Hence  determining  the  ultimate  capacities  of  lasercom  channels  is  of  immediate  rel¬ 
evance.  Much  work  has  already  been  done  on  quantum  information  theory  [5,  6], 
which  sets  ultimate  limits  on  the  rates  of  reliable  communication  of  classical  informa¬ 
tion  and  quantum  information  over  quantum  communication  channels.  As  in  classical 
information  theory,  quantum  information  theory  does  not  tell  us  the  transmitter  and 
receiver  structures  that  would  achieve  the  best  communication  rates  for  specific  forms 
of  quantum  noise.  Nevertheless,  the  limits  set  by  quantum  information  theory  are  ex¬ 
tremely  useful  in  determining  the  degree  to  which  available  technology  can  approach 
the  ultimate  performance  bounds. 

The most famous classical channel capacity formula is Shannon’s result for the
classical additive white Gaussian noise channel. For a complex-valued channel model
in which we transmit $a$ and receive $c = \sqrt{\eta}\,a + \sqrt{1-\eta}\,b$, where $0 < \eta < 1$ is the
channel’s transmissivity and $b$ is a zero-mean, isotropic, complex-valued Gaussian
random variable that is independent of $a$, Shannon’s capacity is

$$C_{\text{classical}} = \ln\left[1 + \frac{\eta\bar{N}}{(1-\eta)N}\right] \ \text{nats/use}, \tag{1.1}$$

when $E(|a|^2) \le \bar{N}$ and $E(|b|^2) = N$.
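
As a quick numerical illustration of Eq. (1.1), the following Python sketch evaluates the classical capacity. It assumes the reading $C_{\text{classical}} = \ln[1 + \eta\bar{N}/((1-\eta)N)]$, with $\bar{N}$ the bound on the transmitted power and $N$ the noise power; the function and argument names are hypothetical and chosen only for this example.

```python
import math


def classical_awgn_capacity(eta: float, n_bar: float, n_noise: float) -> float:
    """Capacity of Eq. (1.1) in nats per channel use.

    eta     : channel transmissivity, 0 < eta < 1
    n_bar   : bound on the mean transmitted power E(|a|^2)
    n_noise : noise power E(|b|^2), must be positive here
    """
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    if n_bar < 0.0 or n_noise <= 0.0:
        raise ValueError("n_bar must be >= 0 and n_noise must be > 0")
    # Signal-to-noise ratio at the receiver: eta*n_bar over (1 - eta)*n_noise.
    snr = eta * n_bar / ((1.0 - eta) * n_noise)
    return math.log1p(snr)
```

For instance, `classical_awgn_capacity(0.5, 1.0, 1.0)` evaluates to ln 2 (about 0.693 nats/use), and the result grows without bound as `n_noise` approaches zero, since the classical model has no noise floor.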

The  lossy  bosonic  channel  provides  a  quantum  model  for  optical  communication 
systems  that  rely  on  fiber  or  free-space  propagation.  In  this  quantum  channel  model, 
we control the state of an electromagnetic mode with photon annihilation operator
$\hat{a}$ at the transmitter, and receive another mode with photon annihilation operator
$\hat{c} = \sqrt{\eta}\,\hat{a} + \sqrt{1-\eta}\,\hat{b}$, where $\hat{b}$ is the annihilation operator of a noise mode that is
in  a  zero-mean,  isotropic,  complex-valued  Gaussian  state.  For  lasercom,  if  quantum 
measurements  corresponding  to  ideal  optical  homodyne  or  heterodyne  detection  are 
employed  at  the  receiver,  this  quantum  channel  reduces  to  a  real-valued  (homodyne) 
or  complex-valued  (heterodyne)  additive  Gaussian  noise  channel,  from  which  the 
following capacity formulas (in nats/use) follow:

$$C_{\text{homodyne}} = \frac{1}{2}\ln\left[1 + \frac{4\eta\bar{N}}{2(1-\eta)N + 1}\right] \tag{1.2}$$

$$C_{\text{heterodyne}} = \ln\left[1 + \frac{\eta\bar{N}}{(1-\eta)N + 1}\right], \tag{1.3}$$

where $\langle\hat{a}^\dagger\hat{a}\rangle \le \bar{N}$ and $\langle\hat{b}^\dagger\hat{b}\rangle = N$, with angle brackets used to denote quantum
averaging. The +1 terms in the noise denominators are quantum contributions, so that
even when the noise mode $\hat{b}$ is unexcited these capacities remain finite, unlike the
situation in Eq. (1.1).
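
The two coherent-detection formulas can be compared numerically with a short Python sketch. It assumes the readings $C_{\text{homodyne}} = \frac{1}{2}\ln[1 + 4\eta\bar{N}/(2(1-\eta)N + 1)]$ and $C_{\text{heterodyne}} = \ln[1 + \eta\bar{N}/((1-\eta)N + 1)]$ for Eqs. (1.2) and (1.3); the function and argument names are hypothetical.

```python
import math


def homodyne_capacity(eta: float, n_bar: float, n_noise: float = 0.0) -> float:
    """Eq. (1.2): homodyne-detection capacity in nats per channel use."""
    return 0.5 * math.log1p(4.0 * eta * n_bar / (2.0 * (1.0 - eta) * n_noise + 1.0))


def heterodyne_capacity(eta: float, n_bar: float, n_noise: float = 0.0) -> float:
    """Eq. (1.3): heterodyne-detection capacity in nats per channel use."""
    return math.log1p(eta * n_bar / ((1.0 - eta) * n_noise + 1.0))
```

With an unexcited noise mode (`n_noise = 0`), $\eta = 0.8$ and $\bar{N} = 15$, these evaluate to ln 7 (about 1.946) and ln 13 (about 2.565) nats/use respectively. Both stay finite because of the +1 term in each denominator.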

The classical capacity of the pure-loss bosonic channel, in which the $\hat{b}$ mode is
unexcited ($N = 0$), was shown in [7] to be $C_{\text{pure-loss}} = g(\eta\bar{N})$ nats/use, where
$g(x) = (x+1)\ln(x+1) - x\ln(x)$ is the Shannon entropy of the Bose-Einstein probability
distribution with mean $x$. This capacity exceeds the $N = 0$ versions of Eqs. (1.2)
and (1.3), as well as the best known bound on the capacity of ideal optical direct
detection [8]. For this pure-loss case, capacity has been shown to be achievable using
single-use coherent-state encoding with a Gaussian prior density [7]. The ultimate
capacity of the thermal-noise ($N > 0$) version of this channel satisfies the lower bound
$C_{\text{thermal}} \ge g(\eta\bar{N} + (1-\eta)N) - g((1-\eta)N)$, and this bound was shown to be the
capacity if the thermal channel obeys a certain minimum output entropy conjecture
[9]. This conjecture states that the von Neumann entropy at the output of the thermal
channel is minimized when the $\hat{a}$ mode is in its vacuum state. Considerable evidence
in support of this conjecture has been accumulated [10], but it has yet to be proven.
Nevertheless, the preceding lower bound already exceeds Eqs. (1.2) and (1.3) as well
as the best known bounds on the capacity of direct detection [8].
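
The quantities in this paragraph are easy to evaluate numerically. The Python sketch below assumes the readings $g(x) = (x+1)\ln(x+1) - x\ln x$, $C_{\text{pure-loss}} = g(\eta\bar{N})$ and $C_{\text{thermal}} \ge g(\eta\bar{N} + (1-\eta)N) - g((1-\eta)N)$; the function and argument names are hypothetical, and $g(0)$ is set to 0 as the limiting value.

```python
import math


def g(x: float) -> float:
    """Entropy (nats) of the Bose-Einstein distribution with mean x."""
    if x < 0.0:
        raise ValueError("mean photon number must be non-negative")
    if x == 0.0:
        return 0.0  # limit of x*ln(x) as x -> 0 is 0
    return (x + 1.0) * math.log(x + 1.0) - x * math.log(x)


def pure_loss_capacity(eta: float, n_bar: float) -> float:
    """Pure-loss (N = 0) capacity g(eta * n_bar), in nats per channel use."""
    return g(eta * n_bar)


def thermal_lower_bound(eta: float, n_bar: float, n_noise: float) -> float:
    """Lower bound on the thermal-noise channel capacity, in nats per channel use."""
    return g(eta * n_bar + (1.0 - eta) * n_noise) - g((1.0 - eta) * n_noise)
```

For $\eta = 0.8$ and $\bar{N} = 15$, `pure_loss_capacity` gives $g(12)$, about 3.53 nats/use, which is larger than the ln 7 and ln 13 obtained from Eqs. (1.2) and (1.3) at $N = 0$ for the same parameters. Setting `n_noise = 0` in `thermal_lower_bound` recovers the pure-loss value.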

Less is known about the classical-information capacity of multi-user bosonic
channels. For multiple-access bosonic communications, in which two or more senders
communicate to a common receiver over a shared propagation medium, single-use
coherent-state encoding with a Gaussian prior and optimum measurement achieves
the sum-rate capacity, but it falls short of achieving the ultimate capacity in the
“corner regions” [11]. Moreover, the capacity region that is lost when coherent
detection is employed instead of the optimum measurement has been quantified for this
multiple-access  channel.  In  this  thesis  we  will  report  our  capacity  analysis  for  the 
bosonic  broadcast  channel.  As  we  described  in  [12],  this  work  led  to  an  inner  bound 
on  the  capacity  region,  which  we  showed  to  be  the  capacity  region  under  the  pre¬ 
sumption  of  a  second  minimum  output  entropy  conjecture.  Both  of  these  minimum 
output  entropy  conjectures  have  been  proven  if  the  input  states  are  restricted  to  be 
Gaussian,  and,  as  we  will  describe  later  in  this  thesis,  we  have  shown  them  to  be 
equivalent  under  this  input-state  restriction.  We  will  also  show  that  the  second  con¬ 
jecture  will  establish  the  privacy  capacity  of  the  lossy  bosonic  channel,  as  well  as  its 
ultimate  quantum  information  carrying  capacity  [13]. 

The Entropy Power Inequality (EPI) from classical information theory is widely
used in coding theorem converse proofs for Gaussian channels. By analogy with the
EPI, we conjecture its quantum version, viz., the Entropy Photon-number Inequality
(EPnI). We will show that the two minimum output entropy conjectures cited above
are simple corollaries of the EPnI. Hence, proving the EPnI would immediately
establish several key capacity results for bosonic communication channels [13].

We will assume that the reader has had some prior acquaintance with quantum
mechanics, quantum optics and information theory. We will use standard notation
widely in use in the quantum optics and information theory literature. For a quick
summary of the background material and notation, see Appendix A. Chapter 2
of this thesis reviews some of our early work on the single-mode bosonic channel
capacity, and describes capacity calculations for the free-space optical channel using
Gaussian-attenuation transmitter and receiver apertures. Chapter 3 starts with a
brief introduction to the capacity of classical discrete memoryless broadcast channels
and then walks the reader through the classical-information capacity analysis for the
bosonic broadcast channel in which a single sender communicates to two or more
receivers through a lossless optical beam splitter with no extra noise or with additive
thermal noise. We prove the ultimate classical information capacities of the bosonic
broadcast channel subject to the minimum output entropy conjectures elucidated in
Chapter 4. In that chapter we describe three conjectures on the minimum output
entropy of bosonic channels, none of which has yet been proven. Proving these
conjectures would, respectively, complete the proofs of the ultimate channel capacity
of the lossy bosonic channel with additive thermal noise, the ultimate capacity region
of the multiple-user bosonic broadcast channel with no extra noise, and that
of the bosonic broadcast channel with additive thermal noise. Chapter 5 begins
by motivating the thought process that led us to conjecture the quantum version
of the Entropy Power Inequality (EPI), which we call the Entropy Photon-number
Inequality (EPnI). There we show that the EPnI subsumes all the minimum output
entropy conjectures described in Chapter 4. We also discuss some recent progress
made towards a proof of the EPnI. The rest of Chapter 5 delves briefly into some
interesting problems in the area of quantum optical information theory, including
the additivity properties of quantum information theoretic quantities, a quantum
version of the central limit theorem, and a conjecture on the monotonicity of quantum
entropy. Chapter 6 concludes the thesis with remarks on the major open problems
ahead of us in the theory of bosonic communications and comments on lines of future
work in this area.
Chapter 2


Point-to-point  Bosonic 
Communication  Channel 

2.1  Background 

Reliable, high data rate communication, carried by electromagnetic waves at
microwave to optical frequencies, is an essential ingredient of our technological age.
Information theory seeks to delineate the ultimate limits on reliable communication
that arise from the presence of noise and other disturbances, and to establish means by
which these limits can be approached in practical systems. The mathematical
foundation for this assessment of limits is Shannon’s Noisy Channel Coding Theorem [2],
which introduced the notion of channel capacity (the maximum mutual information
between a channel’s input and output) as the highest rate at which error-free
communication could be maintained. Textbook treatments of channel capacity [3], [4] study
channel models, ranging from the binary symmetric channel’s digital abstraction
to the additive white-Gaussian-noise channel’s idealization of thermal-noise-limited
waveform transmission, for which classical physics is the underlying paradigm.
Fundamentally, however, electromagnetic waves are quantum mechanical, i.e., they are
boson fields [14], [15]. Moreover, high-sensitivity photodetection systems have long
been limited by noise of quantum mechanical origin [16]. Thus it would seem that
determining the ultimate limits on optical communication would necessarily involve
an explicitly quantum analysis, but such has not been the case. Nearly all work
on the communication theory of optical channels (viz., that done for systems with
laser transmitters and either coherent-detection or direct-detection receivers) uses
semiclassical (shot-noise) models (see, e.g., [1], [17]). Here, electromagnetic waves are
taken to be classical entities, and the fundamental noise is due to the random
release of discrete charge carriers in the process of photodetection. Inasmuch as the
quantitative results obtained from shot-noise analyses of such systems are known to
coincide with those derived in rigorous quantum-mechanical treatments [18], it might
be hoped that the semiclassical approach would suffice. But Helstrom’s derivation
[19] of the optimum quantum receiver for binary coherent-state (laser light) signaling
demonstrated that the lowest error probability, at constant average photon number,
required a receiver that was neither coherent detection nor direct detection. That
Dolinar [20] was able to show how Helstrom’s optimum receiver could be realized
with a photodetection feedback system that admits a semiclassical analysis did
not alleviate the need for a fully quantum-mechanical theory of optical
communication, as Shapiro et al. [21] soon proved that even better binary-communication
performance could be obtained by use of two-photon coherent state (now known as
squeezed state) light, for which semiclassical photodetection theory did not apply.

In quantum mechanics, the state of a physical system together with the
measurement that is made on that system determine the statistics of the outcome of that
measurement, see, e.g., [14]. Thus in seeking the classical information capacity of a
bosonic channel, we must allow for optimization over both the transmitted quantum
states and the receiver’s quantum measurement. In particular, it is not appropriate
to immediately restrict consideration to coherent-state transmitters and
coherent-detection or direct-detection receivers. Imposing these structural constraints leads to
Gaussian-noise (Shannon-type) capacity formulas for coherent (homodyne and
heterodyne) detection [22] and a variety of Poisson-noise capacity results (depending on the
power and/or bandwidth constraints that are enforced) for shot-noise-limited direct
detection [8, 23, 24, 25, 26]. None of these results, however, can be regarded as
specifying the ultimate limit on reliable communication at optical frequencies. What is
needed for deducing the fundamental limits on optical communication is the analog of
Shannon’s Noisy Channel Coding Theorem, free of unjustified structural constraints
on the transmitter and receiver, that applies to transmission of classical information
over a noisy quantum channel, viz., the Holevo-Schumacher-Westmoreland (HSW)
Theorem [27, 28, 29].

Until recently, little had been done to address the classical information capacity of
bosonic quantum channels. As will be seen below, the HSW Theorem renders
quantum measurement optimization an implicit, rather than explicit, part of capacity
determination, and confronts a superadditivity property that is absent from classical
Shannon theory. Prior to this theorem, and well after its proof, about the only
bosonic channel whose classical information capacity had been determined was the
lossless channel [30, 31], in which the field modes (with annihilation operators $\{\hat{a}_j\}$)
controlled by the transmitter are available for measurement (without loss, hence
without additional quantum noise) at the receiver. This situation changed dramatically
when we obtained the capacity of the pure-loss channel [7], i.e., one in which
photons may be lost en route from the transmitter to the receiver while incurring the
minimal additional quantum noise required to preserve the Heisenberg uncertainty
relation. We then considered active channel models, in which noise photons are
injected from an external environment or the signal is amplified with unavoidable
quantum noise, obtaining upper and lower bounds on the resulting channel
capacities, which are asymptotically tight at low and high noise levels [9]. [We conjectured
that our lower bounds are in fact the capacities, but we have yet to prove that
assertion.] Collectively, the preceding channel models can represent line-of-sight
free-space optical communications (see [7], [9]) and loss-limited fiber-optic communications
with or without pre-detection optical amplification. Furthermore, the classical-noise
channel, in which optical amplification is used to balance the attenuation due to
free-space diffraction or fiber propagation, is the quantum analog of Shannon’s additive
white-Gaussian-noise channel; thus its capacity is especially interesting in comparison
to Shannon’s well-known formula.

For the pure-loss case, it turns out that capacity is achievable with coherent-state
(laser light) encoding, but a multi-symbol quantum measurement (a joint
measurement over entire codewords) is required. Heterodyne detection is asymptotically
optimum in the limit of large average photon number for single-mode operation [7].
The same is true in the limit of high average power level for wideband operation over
the far-field free-space channel [7], [9]. However, all coherent reception techniques
fall short of the HSW Theorem capacity for the pure-loss channel in photon-starved or
power-starved scenarios such as deep-space communication. We show later in this
chapter that at very low photon numbers per mode, the direct-detection receiver along
with a coherent-state on-off-keying modulation can achieve data rates very close to
the ultimate capacity. For these applications it becomes especially important to find
practical ways to reap the capacity advantage that multi-symbol quantum
measurement affords. In the remainder of this chapter we review the results we have obtained
so far toward developing these approaches and applying them to the thermal-noise
and classical-noise channels, as well as to broadcast channels.

Section  2.2  provides  a  quick  summary  of  bosonic  channel  models  and  the  HSW 
theorem.  Section  2.3  presents  our  capacity  results  for  the  point-to-point  single-mode 
channels.  Section  2.4  then  addresses  multiple  spatio-temporal  modes  of  the  free- 
space  optical  channel  using  Gaussian  apertures,  something  that  is  easily  analyzed 
by  tensoring  up  a  collection  of  single-mode  models.  Finally,  section  2.5  presents 
our  capacity  results  for  modulation  schemes  using  coherent-state  codewords  that  are 
geared  towards  achieving  high  data  rates  at  very  low  input  power  regimes. 


2.2  Bosonic  communication  channels 

We are interested in the classical communication capacities of point-to-point bosonic
channels with additive quantum Gaussian noise and practical means for
communicating at rates approaching these capacities. The three main categories of point-to-point
bosonic channels that we describe below are the lossy channel, the amplifying
channel, and the classical-noise channel. For each single-mode channel, the transmitter
Alice (A) sends out an electromagnetic-field mode with annihilation operator $\hat{a}$ and
the output is received by the receiver Bob (B), which is another field mode with
annihilation operator $\hat{b}$. The channels of interest are not unitary evolutions, so they are
all governed by TPCP maps that relate their output density operators, $\rho^B$, to their
input density operators, $\rho^A$.

2.2.1  The  lossy  channel 

The TPCP map $\mathcal{E}_\eta^N(\cdot)$ for the single-mode lossy channel can be derived from the
commutator-preserving beam splitter relation

$$\hat{b} = \sqrt{\eta}\,\hat{a} + \sqrt{1-\eta}\,\hat{e}, \tag{2.1}$$

in which the annihilation operator $\hat{e}$ is associated with an environmental (noise)
quantum system E, and $0 \le \eta \le 1$ is the channel transmissivity. [See [32] for how this
single-mode map leads to the quantum version of the Huygens-Fresnel diffraction
integral, and for a quantum characteristic function specification of its associated TPCP
map.] For the pure-loss channel, the $\hat{e}$ mode is in its vacuum state; for the
thermal-noise channel this mode is in a thermal state, viz., an isotropic-Gaussian mixture of
coherent states with average photon number $N > 0$,

$$\rho^E = \int \frac{\exp(-|\mu|^2/N)}{\pi N}\, |\mu\rangle\langle\mu|\, d^2\mu. \tag{2.2}$$


2.2.2  The  amplifying  channel 

The TPCP map $\mathcal{A}_\kappa^N(\cdot)$ for the single-mode amplifying channel can be derived from
the commutator-preserving phase-insensitive amplifier relation [33]

$$\hat{b} = \sqrt{\kappa}\,\hat{a} + \sqrt{\kappa-1}\,\hat{e}^\dagger, \tag{2.3}$$

where $\hat{e}$ is now the modal annihilation operator for the noise introduced by the
amplifier and $\kappa \ge 1$ is the amplifier gain. This amplifier injects the minimum possible
noise when the $\hat{e}$ mode is in its vacuum state; in the excess-noise case this mode’s
density operator is the isotropic-Gaussian coherent-state mixture (2.2).

2.2.3  The  classical-noise  channel 

The classical-noise channel can be viewed as the cascade of a pure-loss channel
followed by a minimum-noise amplifying channel $\mathcal{A}_\kappa^0$ whose gain exactly compensates
for the loss, $\kappa = 1/\eta$. Then, with $\eta = 1/(M+1)$, we obtain the following TPCP map
for the classical-noise channel,

$$\rho^B = \mathcal{N}_M(\rho^A) = \int d^2\mu\, \frac{\exp(-|\mu|^2/M)}{\pi M}\, D(\mu)\,\rho^A\,D^\dagger(\mu), \tag{2.4}$$

where $D(\mu)$ is the displacement operator, i.e., $\hat{b} = \hat{a} + m$ where $m$ is a zero-mean,
isotropic Gaussian noise with variance given by $\langle |m|^2 \rangle = M$, so that this channel is
the quantum version of the additive white-Gaussian-noise channel.
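
The cascade claim can be checked directly from the mode relations (2.1) and (2.3). The short derivation below is a sketch under stated assumptions: the loss channel's environment mode $\hat{e}$ and the amplifier's noise mode, written here as $\hat{f}$ (a label introduced only for this illustration), are taken to be independent and both in vacuum.

```latex
\begin{align*}
\hat{b} &= \sqrt{\kappa}\,\bigl(\sqrt{\eta}\,\hat{a} + \sqrt{1-\eta}\,\hat{e}\bigr)
           + \sqrt{\kappa-1}\,\hat{f}^{\dagger} \\
        &= \hat{a} + \sqrt{M}\,\bigl(\hat{e} + \hat{f}^{\dagger}\bigr)
           \qquad \text{for } \kappa = 1/\eta = M+1, \\
\intertext{since $\kappa\eta = 1$, $\kappa(1-\eta) = M$ and $\kappa - 1 = M$.
With $\hat{m} \equiv \sqrt{M}\,(\hat{e} + \hat{f}^{\dagger})$ and both noise modes in vacuum,}
\langle \hat{m}^{\dagger}\hat{m} \rangle
        &= M\bigl(\langle \hat{e}^{\dagger}\hat{e} \rangle
           + \langle \hat{f}\hat{f}^{\dagger} \rangle\bigr) = M(0 + 1) = M, \\
[\hat{m}, \hat{m}^{\dagger}]
        &= M\bigl([\hat{e},\hat{e}^{\dagger}] + [\hat{f}^{\dagger},\hat{f}]\bigr) = M(1 - 1) = 0 .
\end{align*}
```

The vanishing commutator is what allows the added noise to be treated as a classical complex random variable with mean-square value M, consistent with the displacement-mixture form of the map.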

2.3  Point-to-point,  Single-Mode  Channels 

Let us begin with a brief survey of recent work on the capacity of the point-to-point
single-mode bosonic communication channel, done by various members of our research
group at MIT, led by Prof. J. H. Shapiro. The details appeared in several published
articles (viz. [7], [9], [10], [11], and [34]). The capacity of the single-mode, pure-loss
channel (2.1), whose transmitter is constrained to use no more than $\bar{N}$ photons on
average in a single use of the channel, is given by

$$C = g(\eta\bar{N}) \ \text{nats/use}, \tag{2.5}$$

where

$$g(x) = (x+1)\ln(x+1) - x\ln(x) \tag{2.6}$$

is the Shannon entropy of the Bose-Einstein probability distribution with mean $x$.
This capacity is achieved by single-use random coding over coherent states using an
isotropic Gaussian distribution which meets the bound on the average number of
transmitted photons per use of the channel. [Note that the optimality of single-use
encoding means that the capacity of the single-mode pure-loss channel is not
superadditive.] This capacity exceeds what is achievable with homodyne and heterodyne
detection,

$$C_{\text{hom}} = \tfrac{1}{2}\ln(1 + 4\eta\bar{N}) \quad\text{and}\quad C_{\text{het}} = \ln(1 + \eta\bar{N}), \tag{2.7}$$

although heterodyne detection is asymptotically optimal as $\bar{N} \to \infty$. The
direct-detection capacity $C_{\text{dir}}$ obtained by using a coherent-state encoding and
photon-counting measurement is not known. $C_{\text{dir}}$ has been shown to satisfy [35]

$$C_{\text{dir}} \le \tfrac{1}{2}\ln(\eta\bar{N}) + o(1) \quad\text{and}\quad \lim_{\bar{N}\to\infty} C_{\text{dir}} = \tfrac{1}{2}\ln(\eta\bar{N}), \tag{2.8}$$

and so is dominated by (2.5) for $\ln(\eta\bar{N}) > 1$. The best known bounds to the
direct-detection capacity have recently been evaluated by Martinez [8], who has shown that
tight lower bounds (achievable rates) to the direct-detection capacity can be obtained
by constraining the input distribution to be a gamma density with parameter $\nu$. For
instance, a lower bound that is obtained with a gamma density input distribution
with $\nu = 1$ is given by

$$C_{\text{dir}} \ge (1 + \eta\bar{N})\ln(1 + \eta\bar{N}) + \int_0^1 \frac{(\eta\bar{N})^2 (1-u)}{1 + \eta\bar{N}(1-u)}\, \frac{du}{\ln u} - \eta\bar{N}\gamma_e, \tag{2.9}$$

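where $\gamma_e = 0.5772\ldots$ is Euler’s constant. The best known upper bound to the
direct-detection capacity is given by [8]:

$$C_{\text{dir}} \le \left(\tfrac{1}{2} + \eta\bar{N}\right)\ln\!\left(\tfrac{1}{2} + \eta\bar{N}\right) - \eta\bar{N}\ln(\eta\bar{N}) - \tfrac{1}{2} + \ln\!\left(1 + \frac{\sqrt{2e} - 1}{\sqrt{1 + 2\eta\bar{N}}}\right). \tag{2.10}$$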
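
To make the comparison between Eqs. (2.5) and (2.7) concrete, the following Python sketch evaluates the three single-mode rates at a few values of the received mean photon number. The function names and sample values are illustrative assumptions; only the formulas come from the text.

```python
import math


def g(x: float) -> float:
    """g(x) = (x+1) ln(x+1) - x ln(x), in nats, with g(0) = 0."""
    if x == 0:
        return 0.0
    return (x + 1) * math.log(x + 1) - x * math.log(x)


def c_ultimate(eta: float, n_bar: float) -> float:
    """Pure-loss capacity, Eq. (2.5), in nats per channel use."""
    return g(eta * n_bar)


def c_homodyne(eta: float, n_bar: float) -> float:
    """Homodyne-detection rate from Eq. (2.7)."""
    return 0.5 * math.log(1 + 4 * eta * n_bar)


def c_heterodyne(eta: float, n_bar: float) -> float:
    """Heterodyne-detection rate from Eq. (2.7)."""
    return math.log(1 + eta * n_bar)


if __name__ == "__main__":
    eta = 1.0  # hypothetical transmissivity; only the product eta * n_bar matters
    for n_bar in (0.01, 0.1, 1.0, 10.0, 1000.0):
        print(n_bar,
              c_ultimate(eta, n_bar),
              c_homodyne(eta, n_bar),
              c_heterodyne(eta, n_bar))
```

Algebraically, g(x) - ln(1 + x) = x ln(1 + 1/x), which stays below 1 nat while both terms grow without bound. That is the sense in which heterodyne detection is asymptotically optimal: the ratio of the two rates tends to one, not their difference to zero.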


Employing the pure-loss channel’s optimal random code ensemble over the
thermal-noise, amplifying, and classical-noise channels leads to the following lower bounds on
their channel capacities:

$$C \ge \begin{cases} g(\eta\bar{N} + (1-\eta)N) - g((1-\eta)N) & \text{thermal-noise channel} \\ g(\kappa\bar{N} + (\kappa-1)(N+1)) - g((\kappa-1)(N+1)) & \text{amplifying channel} \\ g(\bar{N} + M) - g(M) & \text{classical-noise channel} \end{cases} \tag{2.11}$$

which were conjectured to be their capacities [9]. The proof of that conjecture is
intimately related to the problem of determining the minimum von Neumann entropies
that can be realized at the output of these channels by choice of their input states.
In particular, showing that coherent-state inputs are the entropy-minimizing input
states would complete the proof of the capacity conjecture stated above, and lower
bounds on the minimum output entropies immediately imply upper bounds on the
corresponding channel capacities. So far, among many other things, it is known that
coherent-state inputs lead to local minima in the output entropies, and we have a
suite of output-entropy lower bounds for single-use encoding over the thermal-noise
and classical-noise channels. We also know that coherent-state inputs minimize the
integer-order Rényi output entropies [34], [36], from which a proof of our capacity
conjecture would follow were a rigorous foundation available for the replica method
of statistical mechanics; see, e.g., [37, 38] for recent classical-communication
applications of the replica method. As additional evidence towards the conjecture, we
collected numerical evidence supporting a stronger version of the conjecture, that the
output state of the bosonic channels for a vacuum-state input majorizes all other
output states. Our further quest into the theory of bosonic multiple-user communication
has led us to propose two new conjectures on the minimum von Neumann entropy
at the output of bosonic channels. Our three minimum output-entropy conjectures
are elaborated in Chapter 4. Proving conjecture 1 would prove the capacity of
the single-user bosonic channel with additive thermal noise. Proving conjecture 2
would prove the ultimate capacity region of the M-user bosonic broadcast channel
with vacuum-state noise. Proving conjecture 3 would prove the ultimate
capacity region of the M-user bosonic broadcast channel with additive thermal noise. As
evidence supporting our conjectures, we prove the Wehrl entropy versions of the
conjectures. Also, in the thesis, we will prove that if we restrict our optimization only to
Gaussian states, then the minimum output entropy conjectures 2 and 3 are both true.
The proof of the Gaussian-state version of conjecture 1 appeared in [10]. In Chapter
5 we will report the quantum version of the Entropy Power Inequality, viz., the
Entropy Photon-number Inequality (EPnI), and we will show that the minimum output
entropy conjectures cited above can be derived as simple special cases of the EPnI.
Hence, proving the EPnI would immediately establish several key capacity results for
bosonic communication channels [13].
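
The three lower bounds in (2.11) are easy to evaluate numerically. The Python sketch below does so and checks that each one reduces to the pure-loss or lossless result g(·) when the noise is switched off. The function and argument names are illustrative assumptions; the sketch says nothing about whether the bounds are tight, which is exactly the open conjecture discussed above.

```python
import math


def g(x: float) -> float:
    """g(x) = (x+1) ln(x+1) - x ln(x), in nats, with g(0) = 0."""
    if x == 0:
        return 0.0
    return (x + 1) * math.log(x + 1) - x * math.log(x)


def lb_thermal(eta: float, n_bar: float, n_noise: float) -> float:
    """Thermal-noise lower bound: g(eta*n_bar + (1-eta)*N) - g((1-eta)*N)."""
    return g(eta * n_bar + (1 - eta) * n_noise) - g((1 - eta) * n_noise)


def lb_amplifying(kappa: float, n_bar: float, n_noise: float) -> float:
    """Amplifying-channel lower bound, with gain kappa >= 1."""
    added = (kappa - 1) * (n_noise + 1)
    return g(kappa * n_bar + added) - g(added)


def lb_classical_noise(n_bar: float, m_noise: float) -> float:
    """Classical-noise lower bound: g(n_bar + M) - g(M)."""
    return g(n_bar + m_noise) - g(m_noise)


if __name__ == "__main__":
    # Hypothetical operating point, chosen only for illustration.
    print(lb_thermal(eta=0.5, n_bar=1.0, n_noise=1.0))
    print(lb_amplifying(kappa=2.0, n_bar=1.0, n_noise=0.0))
    print(lb_classical_noise(n_bar=1.0, m_noise=1.0))
```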


2.4  Multiple-Spatial-Mode,  Pure-Loss,  Free-Space 
Channel 

As an explicit example of the mean-energy constrained, pure-loss channel, we now
treat the case of free-space optical communication. My SM thesis [39] treated the
wideband pure-loss channel with frequency-independent loss. Despite its providing
insight into multi-mode capacity, this analysis does not necessarily pertain to a
realistic scenario. In [39] we also studied the far-field, scalar free-space channel in which
line-of-sight propagation of a single polarization occurs over an L-m-long path from
a circular transmitter pupil (area $A_t$) to a circular receiver pupil (area $A_r$) with the
transmitter restricted to use frequencies $\{\omega : 0 \le \omega \le \omega_c \ll \omega_0 = 2\pi c L/\sqrt{A_t A_r}\}$.
This frequency range is the far-field power transfer regime, wherein there is only
a single spatial mode that couples appreciable power from the transmitter pupil to
the receiver pupil, and its transmissivity at frequency $\omega$ is $\eta(\omega) = (\omega/\omega_0)^2 \ll 1$.
Figure 2-1 shows the geometry, the power allocations versus frequency for
heterodyne, homodyne, and optimal reception, and their corresponding capacities versus
transmitted power normalized by $P_0 = 2\pi\hbar c^2 L^2/A_t A_r$, when only this dominant
spatial mode is employed [7]. Far-field, free-space transmissivity increases as $\omega^2$; thus
high frequencies are used preferentially for this channel because the transmissivity
advantage of high-frequency photons more than compensates for their higher energy
consumption.

Figure 2-1: Capacity results for the far-field, free-space, pure-loss channel: (a)
propagation geometry; (b) capacity-achieving power allocations $\hbar\omega\bar{N}(\omega)$ versus frequency
$\omega$ for heterodyne (dashed curve), homodyne (dotted curve), and optimal reception
(solid curve), with $\omega_c$ and $\hbar\omega_c/\eta(\omega_c)$ being used to normalize the frequency and the
power-spectra axes, respectively; and (c) wideband capacities of optimal, homodyne,
and heterodyne reception versus transmitter power $P$, with $P_0 = 2\pi\hbar c^2 L^2/A_t A_r$ used
for the reference power.

We also explored the near-field behavior of the pure-loss free-space channel [40],
by employing the full prolate-spheroidal wave function normal-mode decomposition
associated with the propagation geometry shown in Fig. 2-1(a) [41, 42]. Near-field
propagation at frequency $\omega = 2\pi c/\lambda$ prevails when $D_f = A_t A_r/(\lambda L)^2$, the product
of the transmitter and receiver Fresnel numbers, is much greater than unity. In this
case there are approximately $D_f$ spatial modes with near-unity transmissivities, with
all other modes affording insignificant power transfer from the transmitter pupil to
the receiver pupil.
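
The far-field and near-field criteria above can be tied together numerically. The Python sketch below evaluates the Fresnel-number product D_f = A_t A_r/(λL)² and the far-field transmissivity (ω/ω_0)² with ω_0 = 2πcL/√(A_t A_r). With ω = 2πc/λ the two expressions coincide algebraically, so the single far-field mode's transmissivity is simply D_f when D_f is much less than one. The aperture areas, wavelength, and path lengths are hypothetical values chosen only for illustration.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def fresnel_product(a_t: float, a_r: float, wavelength: float, path_len: float) -> float:
    """D_f = A_t * A_r / (lambda * L)^2 for pupil areas in m^2 and lengths in m."""
    return a_t * a_r / (wavelength * path_len) ** 2


def omega_0(a_t: float, a_r: float, path_len: float) -> float:
    """Reference frequency omega_0 = 2 pi c L / sqrt(A_t * A_r), in rad/s."""
    return 2 * math.pi * C_LIGHT * path_len / math.sqrt(a_t * a_r)


def far_field_transmissivity(omega: float, a_t: float, a_r: float, path_len: float) -> float:
    """eta(omega) = (omega / omega_0)^2; meaningful only when the result is << 1."""
    return (omega / omega_0(a_t, a_r, path_len)) ** 2


if __name__ == "__main__":
    a_t = a_r = 0.01          # hypothetical 0.01 m^2 pupils
    wavelength = 1.55e-6      # hypothetical 1.55 micron carrier
    omega = 2 * math.pi * C_LIGHT / wavelength
    for path_len in (1.0e2, 1.0e6):   # a 100 m link and a 1000 km link
        d_f = fresnel_product(a_t, a_r, wavelength, path_len)
        regime = "near field" if d_f > 1 else "far field"
        print(path_len, d_f, far_field_transmissivity(omega, a_t, a_r, path_len), regime)
```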

We also sketched out a general wideband capacity analysis for the free-space
channel in [39], which applies when neither the far-field nor the near-field assumptions may
be made for the entire channel spectrum. At very low frequencies the channel looks
like the far-field channel we analyzed earlier, in which the channel transmissivity
$\eta(\omega) \propto \omega^2$. So in that region, we expect that the optimal power allocation uses
high-frequency photons preferentially, and that the power goes to zero at low frequencies.
At higher frequencies, the channel is closer to the lossless wideband channel we
considered earlier, for which we know that the optimal power allocation goes to zero at
very high frequencies [39]. So, in the ultra-wideband case, we would expect the power
allocation to vanish both for very low and very high frequencies. This intuition is
validated later in this section.


The actual capacity calculation for the general wideband free-space channel for the
hard circular-apertures case is difficult owing to the complicated nonlinear dependence
of modal transmissivity on center frequency of transmission, for which closed-form
expressions are not available. In [43], we took another approach to the wideband
capacity of the pure-loss free-space channel, by employing either the Hermite-Gaussian
(HG) or Laguerre-Gaussian (LG) mode sets that are associated with the soft-aperture
(Gaussian-attenuation pupil) version of the Fig. 2-1(a) propagation geometry. Two
benefits are derived from this approach. First, closed-form expressions become
available for the modal transmissivities, as opposed to the hard-aperture case [Fig. 2-1(a)],
for which numerical evaluations or analytical approximations must be employed.
Second, the LG modes have been the subject of a great deal of interest in the quantum
optics and quantum information communities [44], owing to their carrying orbital
angular momentum. Thus it was germane to explore whether they conferred any special
advantage in regard to classical information transmission. As we shall describe in
the next subsection, the modal transmissivities of the LG modes are isomorphic to
those of the HG modes. Inasmuch as the latter do not convey orbital angular
momentum, it is clear that such conveyance is not essential to capacity-achieving classical
communication over the pure-loss free-space channel. After this, we will compute the
classical capacity of the general wideband free-space channel with soft apertures, and
will describe the scheme for optimal power allocation across the spatio-temporal
modes of the quantized optical field to achieve the ultimate rate limits afforded by
coherent-state encoding, both with conventional coherent detectors and with the
optimum joint-detection quantum measurement.




2.4.1  Propagation  Model:  Hermite-Gaussian  and  Laguerre-Gaussian
Mode  Sets


In lieu of the hard-aperture propagation geometry from Fig. 2-1(a), wherein the
transmitter and receiver pupils are perfectly transmitting apertures within
otherwise opaque planar screens, we now introduce the soft-aperture propagation
geometry of Fig. 2-2. From the quantum version of scalar Fresnel diffraction theory [32],
we know that it is sufficient, insofar as this propagation geometry is concerned, to
identify a complete set of monochromatic spatial modes, for a single electromagnetic
polarization of frequency $\omega = 2\pi c/\lambda = ck$, that maintain their orthogonality when
transmitted through this channel. The resulting input and output mode sets
constitute a singular-value decomposition (SVD) of the linear propagation kernel (spatial
impulse response) associated with this geometry, which we will now develop.


Let $u_i(\mathbf{x})$, for $\mathbf{x}$ a 2D vector in the transmitter’s exit-pupil plane, denote a
frequency-$\omega$ field entering the transmitter pupil that is normalized to satisfy

$$\int d^2\mathbf{x}\, |u_i(\mathbf{x})|^2 = 1. \tag{2.12}$$

After masking of the field by Gaussian-intensity transmitter and receiver apertures,
and undergoing free-space Fresnel diffraction over an L-m-long path, the field
immediately after the receiver pupil is given by

$$u_o(\mathbf{x}') = \int d^2\mathbf{x}\, u_i(\mathbf{x})\, h(\mathbf{x}',\mathbf{x}), \tag{2.13}$$

where

$$h(\mathbf{x}',\mathbf{x}) = \exp(-|\mathbf{x}'|^2/r_R^2)\, \frac{\exp(ikL + ik|\mathbf{x}-\mathbf{x}'|^2/2L)}{i\lambda L}\, \exp(-|\mathbf{x}|^2/r_T^2), \tag{2.14}$$

is the channel’s spatial impulse response.
Figure 2-2: Propagation geometry with soft apertures.


The singular-value (normal-mode) decomposition of $h(\mathbf{x}',\mathbf{x})$ is

$$h(\mathbf{x}',\mathbf{x}) = \sum_{m=1}^{\infty} \sqrt{\eta_m}\, \phi_m(\mathbf{x}')\, \Phi_m^*(\mathbf{x}), \tag{2.15}$$

where

$$1 \ge \eta_1 \ge \eta_2 \ge \eta_3 \ge \cdots \ge 0, \tag{2.16}$$

are the modal transmissivities, $\{\Phi_m(\mathbf{x})\}$ is a complete orthonormal (CON) set of
functions (input modes) on the transmitter’s exit-pupil plane, and $\{\phi_m(\mathbf{x}')\}$ is a CON
set of functions (output modes) on the receiver’s entrance-pupil plane. Physically, this
decomposition implies that $h(\mathbf{x}',\mathbf{x})$ can be separated into a countably infinite set of
parallel channels in which transmission of $u_i(\mathbf{x}) = \Phi_m(\mathbf{x})$ results in reception of
$u_o(\mathbf{x}') = \sqrt{\eta_m}\,\phi_m(\mathbf{x}')$. Singular-value decompositions are unique if their $\{\eta_m\}$ are
distinct. When degeneracies exist, the SVD is not unique. In particular, a linear
combination of input modes with the same $\eta_m$ value produces $\sqrt{\eta_m}$ times that same
linear combination of the associated output modes after propagation through $h(\mathbf{x}',\mathbf{x})$.

The spatial impulse response $h(\mathbf{x}', \mathbf{x})$ has both rectangular and cylindrical
symmetries. The Hermite-Gaussian (HG) modes $\Phi_{n,m}(x, y)$ provide an SVD of this
channel that has rectangular symmetry, whereas Laguerre-Gaussian (LG) modes $\Phi_{p,l}(r, \theta)$
provide an alternative SVD for this channel with cylindrical symmetry. Even though
the spatial forms of the two sets of CON spatial modes are completely different, the
associated modal transmissivities for the HG and the LG modes are the same, and are
given by

$$\eta_q = \left(\frac{1 + 2D_f - \sqrt{1 + 4D_f}}{2D_f}\right)^{q}, \tag{2.17}$$

for $q = 1, 2, \ldots$, where $D_f = (k r_T^2/4L)(k r_R^2/4L)$ is the product of the transmitter-pupil and
receiver-pupil Fresnel numbers for this soft-aperture configuration. Also, there are $q$
spatial modes with transmissivity $\eta_q$. The doubly-indexed HG modes $\Phi_{n,m}(x, y)$ with
$n + m + 1 = q$ span the same eigenspace as the doubly-indexed LG modes $\Phi_{p,l}(r, \theta)$ with
$2p + |l| + 1 = q$, and hence are related by a unitary transformation. Channel capacity,
when either the HG or LG modes are employed for information transmission, depends
only on their modal transmissivities. Hence, owing to singular-value degeneracies,
the HG and LG modes of the soft-aperture free-space channel are equivalent mode
sets as far as channel capacity is concerned. A single frequency-$\omega$ photon in the LG
mode $\Phi_{p,l}(r, \theta)$ carries orbital angular momentum $\hbar l$ directed along the propagation
($z$) axis, whereas that same photon in the HG mode $\Phi_{n,m}(x, y)$ carries no $z$-directed
orbital angular momentum. The equivalence of the $\{\eta_{p,l}\}$ and the $\{\eta_{n,m}\}$ then implies
that angular momentum does not play a role in determining the channel capacity for
classical information transmission over the free-space channel shown in Fig. 2-2.
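As a numerical illustration of Eq. (2.17), the following minimal Python sketch evaluates the modal transmissivities for a given Fresnel-number product. The function names are hypothetical and chosen only for this example. The sketch also checks an identity that follows from summing the geometric-type series in (2.17) but is not stated in the text above: with $q$ modes sharing transmissivity $\eta_q$, the total $\sum_q q\,\eta_q$ works out to $D_f$.

```python
import math


def modal_transmissivity(d_f: float, q: int) -> float:
    """eta_q from Eq. (2.17) for a Fresnel-number product d_f > 0."""
    base = (1.0 + 2.0 * d_f - math.sqrt(1.0 + 4.0 * d_f)) / (2.0 * d_f)
    return base ** q


def total_transmissivity(d_f: float, q_max: int = 2000) -> float:
    """Sum of q * eta_q for q = 1..q_max (q modes share transmissivity eta_q).

    The series is truncated at q_max, which is an assumption of this sketch;
    it is ample for moderate d_f because eta_q decays geometrically in q.
    """
    return sum(q * modal_transmissivity(d_f, q) for q in range(1, q_max + 1))


if __name__ == "__main__":
    for d_f in (0.1, 1.0, 10.0):
        first_three = [modal_transmissivity(d_f, q) for q in (1, 2, 3)]
        print(d_f, first_three, total_transmissivity(d_f))
```

In the near field (large $D_f$) many modes have appreciable transmissivity, whereas in the far field (small $D_f$) the $q = 1$ mode dominates.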


2.4.2  Wideband  Capacities  with  Multiple  Spatial  Modes 

In this section, we shall address the wideband capacities that can be achieved over
the pure-loss, scalar free-space channel shown in Fig. 2-2 using either heterodyne
detection, homodyne detection, or the optimum joint-detection receiver. We will
allow the transmitter to use multiple spatial modes, from either the HG or LG mode
sets, and all frequencies $\omega \in [0, \infty)$ subject to a constraint, $P$, on the average power
in the field entering the transmitter’s exit pupil. It follows from our prior work [7, 40]
that the capacities we are seeking satisfy

$$C(P) = \max_{N_q(\omega)} \sum_{q=1}^{\infty} \int_0^{\infty} \frac{d\omega}{2\pi}\, q\, C_{\text{s-m}}\big(\eta(\omega)^q, N_q(\omega)\big), \tag{2.18}$$

where the maximization is subject to the average power constraint,

$$P = \sum_{q=1}^{\infty} \int_0^{\infty} \frac{d\omega}{2\pi}\, \hbar\omega\, q\, N_q(\omega), \tag{2.19}$$

and

$$\eta(\omega)^q = \left(\frac{1 + 2(\omega/\omega_0)^2 - \sqrt{1 + 4(\omega/\omega_0)^2}}{2(\omega/\omega_0)^2}\right)^{q} \tag{2.20}$$

is the modal transmissivity at frequency $\omega$ with $q$-fold degeneracy, with $\omega_0 = 4cL/(r_T r_R)$
being the frequency at which $D_f = 1$. In (2.18),

$$C_{\text{s-m}}(\eta, N) = \begin{cases} g(\eta N), & \text{for optimum reception} \\ \ln(1 + \eta N), & \text{for heterodyne detection} \\ \tfrac{1}{2}\ln(1 + 4\eta N), & \text{for homodyne detection} \end{cases} \tag{2.21}$$

are the relevant single-mode capacities as functions of the modal transmissivity, $\eta$,
and the average photon number, $N$, for that mode. Regardless of the frequency
dependence of $\eta(\omega)$, the single-mode capacity formulas for heterodyne and homodyne
detection imply that their wideband multiple-spatial-mode capacities bear the following
relationship,

$$C_{\text{hom}}(P) = \tfrac{1}{2}\, C_{\text{het}}(4P). \tag{2.22}$$

Thus, only two maximizations need to be performed, both of which can be done
via Lagrange multipliers, to obtain the wideband multiple-spatial-mode capacities for
optimum reception, heterodyne detection, and homodyne detection.
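The single-mode formulas in (2.21) are easy to evaluate directly. The sketch below is illustrative only: it assumes capacities in nats, uses $g(x) = (1+x)\ln(1+x) - x\ln x$, and its function names are our own. It shows the pointwise identity behind (2.22): the homodyne capacity at photon number $N$ equals half the heterodyne capacity at $4N$.

```python
import math


def g(x: float) -> float:
    """Bose-Einstein entropy g(x) = (1+x)ln(1+x) - x ln(x), in nats."""
    if x <= 0.0:
        return 0.0
    return (1.0 + x) * math.log1p(x) - x * math.log(x)


def c_opt(eta: float, n: float) -> float:
    """Single-mode capacity with optimum reception, Eq. (2.21)."""
    return g(eta * n)


def c_het(eta: float, n: float) -> float:
    """Single-mode capacity with heterodyne detection, Eq. (2.21)."""
    return math.log1p(eta * n)


def c_hom(eta: float, n: float) -> float:
    """Single-mode capacity with homodyne detection, Eq. (2.21)."""
    return 0.5 * math.log1p(4.0 * eta * n)


if __name__ == "__main__":
    eta = 0.5
    for n in (0.1, 1.0, 4.0, 20.0):
        print(n, c_opt(eta, n), c_het(eta, n), c_hom(eta, n))
```

For a single mode, homodyne detection is ahead of heterodyne detection when $\eta N < 2$ and behind it when $\eta N > 2$, since $\tfrac{1}{2}\ln(1+4x) = \ln(1+x)$ exactly at $x = 2$.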


The results we have obtained by performing the preceding maximizations are as
follows. The optimum-reception capacity (in nats/sec) and its associated optimum
modal-power spectra are given by


C(P) =

and

respectively, where c is a Lagrange multiplier chosen to enforce the average power
constraint. The corresponding capacity and optimum modal-power spectra for
heterodyne detection are

where $\beta$ is another Lagrange multiplier, again chosen to enforce the average power
constraint. Finally, the capacity and optimum power allocation for homodyne
detection are given by


’1  ln  ^2Pvov(v)q^j 


(2.27) 


and 
where $\beta$ is a Lagrange multiplier, chosen to enforce the average power constraint


(2.28) 
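The Lagrange-multiplier step can be sketched for a single mode of transmissivity $\eta$ at frequency $\omega$. This is an illustrative derivation only: it treats each mode separately, and the multipliers $\mu$ and $\sigma$ are our own notation, not necessarily normalized in the same way as the multipliers in the results above.

```latex
% Heterodyne detection: stationarity of ln(1 + eta N) - mu * hbar * omega * N, with N >= 0
\frac{\eta}{1 + \eta N} = \mu\hbar\omega
\;\Longrightarrow\;
N = \max\!\left(0,\; \frac{1}{\mu\hbar\omega} - \frac{1}{\eta}\right).

% Homodyne detection: stationarity of (1/2) ln(1 + 4 eta N) - mu * hbar * omega * N, with N >= 0
\frac{2\eta}{1 + 4\eta N} = \mu\hbar\omega
\;\Longrightarrow\;
N = \max\!\left(0,\; \frac{1}{2\mu\hbar\omega} - \frac{1}{4\eta}\right).

% Optimum reception: stationarity of g(eta N) - sigma * hbar * omega * N, using g'(x) = ln(1 + 1/x)
\eta \ln\!\left(1 + \frac{1}{\eta N}\right) = \sigma\hbar\omega
\;\Longrightarrow\;
N = \frac{1/\eta}{\exp(\sigma\hbar\omega/\eta) - 1} > 0.
```

Under these assumptions the two coherent-detection allocations vanish whenever $\hbar\omega/\eta$ exceeds a threshold set by the multiplier, while the optimum-reception allocation is strictly positive for every mode and frequency.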


2.4.3  Optimum  power  allocation:  water-filling 

The capacity-achieving power spectrum for optimal reception employs all spatial
modes and all frequencies. On the other hand, the capacity-achieving power spectra
for heterodyne and homodyne detection are “water-filling” allocations, i.e., they
fill spatial-mode/frequency volumes above their appropriate noise-to-transmissivity-ratio
contours until the average power constraint is met (Fig. 2-3). That water-filling
power allocation should be capacity achieving for these coherent detection cases is
hardly a surprise, as water-filling power allocation has long been known to be optimal
for additive Gaussian noise channels [4]. A consequence of water-filling power
allocation is that heterodyne and homodyne detection only employ a finite number of
spatial modes to achieve their respective capacities, whereas optimal-reception capacity
needs all spatial modes. This behavior is illustrated in Fig. 2-4(a)-(c), where we
have plotted the capacity-achieving power spectra for optimum reception, homodyne
detection, and heterodyne detection when $P = 8.12\,\hbar\omega_0^2$. In this case, heterodyne
detection uses $1 \le q \le 3$ (a total of 6 spatial modes) with non-zero power, and
homodyne detection uses $1 \le q \le 4$ (a total of 10 spatial modes) with non-zero power.
Optimum reception uses all spatial modes, but we have only plotted the spectra for
$1 \le q \le 6$.
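The water-filling behavior can be illustrated on a small, discretized toy problem. The sketch below is not the thesis's wideband computation: it assumes a finite list of parallel modes with hypothetical transmissivities `eta` and per-photon energy costs `cost` (standing in for $\hbar\omega$), maximizes $\sum_i \ln(1 + \eta_i N_i)$ as for heterodyne detection, and finds the water level by bisection.

```python
def water_fill(eta, cost, power, iters=200):
    """Maximize sum_i ln(1 + eta[i]*n[i]) s.t. sum_i cost[i]*n[i] = power, n[i] >= 0.

    The KKT conditions give n[i] = max(0, level/cost[i] - 1/eta[i]), where `level`
    is the water level (the reciprocal of the Lagrange multiplier).
    """
    floors = [c / e for c, e in zip(cost, eta)]  # cost-to-transmissivity "floor" of each mode

    def spent(level):
        # Total power used when the water stands at `level`.
        return sum(max(0.0, level - f) for f in floors)

    lo, hi = 0.0, power + max(floors)  # spent(lo) = 0 <= power <= spent(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if spent(mid) < power:
            lo = mid
        else:
            hi = mid
    level = 0.5 * (lo + hi)
    return [max(0.0, level - f) / c for f, c in zip(floors, cost)]


if __name__ == "__main__":
    eta = [0.9, 0.5, 0.1, 0.01]      # hypothetical modal transmissivities
    cost = [1.0, 1.0, 1.0, 1.0]      # hypothetical per-photon costs
    print(water_fill(eta, cost, power=1.0))
```

Only modes whose floor lies below the water level receive power, which is why a finite power budget activates a finite number of modes in this toy model.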

In Fig. 2-4(d) we have plotted the heterodyne detection, homodyne detection,
and optimum reception capacities in bits/sec, normalized by $\omega_0$, versus the normalized
power, $P/\hbar\omega_0^2$. Unlike the case seen in Fig. 2-1(c) for the wideband capacities
of the single-spatial-mode, far-field pure-loss channel, in which heterodyne detection
outperforms homodyne detection at high power levels, Fig. 2-4(d) shows that homodyne
detection is consistently better than heterodyne detection for the multiple-spatial-mode
scenario. This behavior has a simple physical explanation. Consider
first the single-spatial-mode wideband capacities. At low power levels, when capacity
is power limited, homodyne detection outperforms heterodyne detection because
at every frequency it suffers less noise. On the other hand, at high enough power
levels single-spatial-mode communication becomes bandwidth limited. In this case
heterodyne detection’s factor-of-two bandwidth advantage over homodyne detection
carries the day. Things are different when multiple spatial modes are available. In this
case, increasing power never reaches bandwidth-limited operation; additional, lower
transmissivity, spatial modes get employed as the power is increased so that the noise
advantage of homodyne detection continues to give a higher channel capacity than
does heterodyne detection.

Figure 2-3: Visualization of the capacity-achieving power allocation for the wideband,
multiple-spatial-mode, free-space channel, with coherent-state encoding and heterodyne
detection as ‘water-filling’ into bowl-shaped steps of a terrace. The horizontal
axis, $\omega/\omega_0$, is a normalized frequency; $n$ is the total number of spatial modes used.
The vertical axis is $(\omega/\omega_0)/\eta(\omega)^q$. Power starts ‘filling’ into this terrace from
the $q = 1$ step. It keeps spilling over to the higher steps as input power increases.

Figure 2-4: Capacity-achieving power spectra for wideband, multiple-spatial-mode
communication over the scalar, pure-loss, free-space channel when $P = 8.12\,\hbar\omega_0^2$: (a)
optimum reception uses all spatial modes although spectra are only shown (from top
to bottom) for $1 \le q \le 6$; (b) homodyne detection uses 10 spatial modes with (from
top to bottom) $1 \le q \le 4$; (c) heterodyne detection uses 6 spatial modes with (from
top to bottom) $1 \le q \le 3$. (d) Wideband, multiple-spatial-mode capacities (in bits
per second) for the scalar, pure-loss, free-space channel that are realized with optimum
reception (top curve), homodyne detection (middle curve), and heterodyne detection
(bottom curve). The capacities, in bits/sec, are normalized by $\omega_0 = 4cL/(r_T r_R)$,
the frequency at which $D_f = 1$, and plotted versus the average transmitter power
normalized by $\hbar\omega_0^2$.

Figure  2-4  shows  that  the  wideband  capacity  realized  with  optimum  reception,  on 
the  multiple-spatial-mode  pure-loss  channel,  increasingly  outstrips  that  of  homodyne 
detection  with  increasing  transmitter  power.  This  advantage  indicates  that  joint 
measurements  over  entire  codewords  afford  performance  that  is  unapproachable  with 
homodyne  detection,  which  is  a  single-use  quantum  measurement. 

2.5  Low-power  Coherent-State  Modulation 

We computed the classical information capacities of the single-mode and wideband
lossy bosonic communication channels, using various structured transmitter encodings
and receiver measurements, in [39]. Of the various modulation states, the
coherent-state encoding techniques are of particular importance, as coherent states
are classical states of light that can be generated readily using lasers. Moreover,
we have shown [7] that coherent-state encoding with an isotropic complex-Gaussian
prior density over all coherent states, along with an optimum receiver measurement,
achieves capacity for the pure-loss bosonic channel. Coherent-state encodings would
be provably optimum for encoding classical messages for thermal-noise bosonic channels
and bosonic broadcast channels, if certain conjectures on the minimum output
entropy of bosonic channels were proven to be true [9, 12]. When the transmitter
is starved for photons, instead of using the full-blown Gaussian distribution over all
coherent states, several simplified encoding techniques using a few coherent states
do remarkably well. These low-power coherent-state based encoding schemes are the
subject of this section.

2.5.1  On-Off  Keying  (OOK) 

A common scheme for optical modulation, which has been in use for many years,
is On-Off Keying (OOK) using coherent states with direct detection measurement.
With direct detection (or photon counting) receivers, the bosonic channel, from the
coherent-state transmitter to the measurement outcome, becomes a classical Poisson
channel, because of the Poisson statistics of the photon-number measurement
on coherent states. This encoding-decoding scheme is widely employed in real systems
because of the easy availability of coherent-state modulators and direct-detection
receivers¹.

Figure 2-5: The “Z”-channel model. The single-mode bosonic channel, when used
with OOK-modulated coherent states and photon number measurement, reduces to
a “Z”-channel when the mean photon number constraint at the input satisfies $N \ll 1$.
The transition probability from logical 1 (input coherent state $|\alpha\rangle$) to logical 0
(vacuum state) is given by $\epsilon = e^{-\eta|\alpha|^2}$.

OOK entails either sending a coherent state $|\alpha\rangle$ or the vacuum state $|0\rangle$ in each
use of the channel. Consider a single-mode lossy bosonic channel with transmissivity
$\eta$ and a mean photon number constraint $N$ at the input of the channel. In the limit
of $N \ll 1$, the bosonic channel for these encoding states reduces to a “Z”-channel
(Figure 2-5), wherein the transition probability from logical 1 (input coherent state
$|\alpha\rangle$) to logical 0 (vacuum state) is given by $\epsilon = e^{-\eta|\alpha|^2}$. The capacity of the channel
in bits per use is given by


$$C_{\text{OOK}}(\eta, N) = \max_{p} \left[ H\!\left(p\,(1 - e^{-\eta N/p})\right) - p\, H\!\left(e^{-\eta N/p}\right) \right], \tag{2.29}$$

where $H(p) = -p\log p - (1-p)\log(1-p)$ is the binary Shannon entropy. Here $p$ is
the probability of sending $|\alpha\rangle$, so the mean photon number constraint gives
$|\alpha|^2 = N/p$ and hence $\epsilon = e^{-\eta N/p}$. The channel
capacity of OOK with direct detection gets closer and closer to optimal capacity as
$N \to 0$, as we see in Figure 2-6. The approach of the OOK capacity to the optimal
capacity is exponentially slow as $N \to 0$. At $\eta N = 10^{-7}$, $C_{\text{OOK}}$ is about 77.5% of
the ultimate capacity $g(\eta N)$, and the ratio $C_{\text{OOK}}/g(\eta N)$ increases at about 0.03 per
decade of decrease of $N$, at very low values of $N$.

¹ Although typical direct-detection receivers are not signal-shot-noise limited photon counters.

Figure 2-6: This figure shows that capacity achieved using OOK modulation and
direct detection gets closer and closer to optimal capacity as $N \to 0$. The ordinate
is the ratio of the OOK and the ultimate capacities in bits per channel use. The
approach of the OOK capacity to the optimal capacity gets exponentially slow as
$N \to 0$, as is evident from the log-scale used for the $\eta N$-axis of the graph. At
$\eta N = 10^{-7}$, $C_{\text{OOK}}$ is about 77.5% of the ultimate capacity $g(\eta N)$.
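The OOK capacity formula (2.29) can be evaluated numerically with a brute-force search over the on-pulse probability $p$. The sketch below is illustrative only: the log-spaced grid, its range, and the function names are our own choices, and base-2 logarithms are assumed so that rates are in bits per channel use.

```python
import math

LN2 = math.log(2.0)


def h2(p: float) -> float:
    """Binary Shannon entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log1p(-p)) / LN2


def g_bits(x: float) -> float:
    """Ultimate capacity g(x) = (1+x)log2(1+x) - x log2(x), in bits per use."""
    if x <= 0.0:
        return 0.0
    return ((1.0 + x) * math.log1p(x) - x * math.log(x)) / LN2


def c_ook(eta_n: float, grid: int = 4000) -> float:
    """Eq. (2.29), maximized over p on a log-spaced grid from 1e-12 to 1."""
    best = 0.0
    for i in range(grid + 1):
        p = 10.0 ** (-12.0 * (1.0 - i / grid))
        click = -math.expm1(-eta_n / p)   # P(at least one photon counted | pulse sent)
        eps = 1.0 - click                 # Z-channel transition probability 1 -> 0
        best = max(best, h2(p * click) - p * h2(eps))
    return best


if __name__ == "__main__":
    for eta_n in (1e-7, 1e-5, 1e-3, 1e-1):
        print(eta_n, c_ook(eta_n) / g_bits(eta_n))
```

The printed ratio is the quantity on the ordinate of Figure 2-6; the grid search is crude but adequate because the objective is flat near its maximum.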


2.5.2  Binary  Phase-Shift  Keying  (BPSK) 

Another common modulation scheme using coherent-state inputs is Binary Phase-Shift
Keying (BPSK), in which the input alphabet comprises two coherent states of
equal magnitude that are 180 degrees out of phase: $\{|\alpha\rangle, |-\alpha\rangle\}$. With a two-element
quantum POVM measurement that results in symmetric outcomes for the two symbol
states, the BPSK channel becomes a binary symmetric channel (BSC). With a mean
photon number constraint of $N$ at the input, it is easy to show that the achievable
capacity using the best symbol-by-symbol measurement at the output (realized by a
sequence of Dolinar receivers [20]) is given by the BSC capacity formula:

$$C_{\text{BPSK}}(\eta N) = 1 - H\!\left(\frac{1 - \sqrt{1 - e^{-4\eta N}}}{2}\right). \tag{2.30}$$


Comparing  performance  of  BPSK  to  that  of  OOK 

Figure  2-7  compares  classical  communication  rates  achievable  by  OOK  (with  direct 
detection)  and  BPSK  (with  Dolinar  reception)  modulation  schemes,  with  the  rates 
achieved  by  doing  homodyne  or  heterodyne  detection  with  an  input  alphabet  over 
all  coherent  states,  chosen  from  an  isotropic  Gaussian  distribution  of  coherent  states. 
The ultimate capacity is given by $g(\eta N)$ bits per channel use. Figure 2-7(a) is for low
N,  whereas  Figure  2-7(b)  compares  the  achievable  rates  at  higher  N.  At  very  low 
mean  photon  number,  OOK  performs  the  best  of  the  conventional  schemes.  In  the  low 
N  regime,  both  the  binary  modulation  schemes,  viz.,  OOK  and  BPSK  perform  better 
than  the  unrestricted  coherent-state  modulation  with  coherent  detection.  In  the  high 
N  regime,  coherent-detection  capacities  outperform  the  binary  schemes,  because  the 
maximum  rate  achievable  using  any  binary  modulation  system  is  1  bit  per  channel 
use. 
Figure 2-7: Comparison of capacities (in bits per channel use) of the single-mode lossy
bosonic channel achieved by: OOK modulation with direct detection; $\{|\alpha\rangle, |-\alpha\rangle\}$
BPSK modulation using coherent states; and homodyne and heterodyne detection
with isotropic-Gaussian random coding over coherent states. For very low values of
$N$, the average transmitter photon number, shown in (a), OOK outperforms all but
the ultimate capacity. At somewhat higher values of $N$, both OOK and BPSK are
better than isotropic-Gaussian random coding with coherent detection. In the high
$N$ regime, coherent-detection capacities outperform the binary schemes, because the
rate achievable by the latter approaches cannot exceed 1 bit per channel use.

Figure 2-8: This figure illustrates the gap between the ultimate BPSK coherent-state
capacity (Equation (2.31)) and the achievable rate using a BPSK coherent-state
alphabet and symbol-by-symbol “Dolinar receiver” measurement (Equation (2.30)).
In order to bridge the gap between these two capacities, optimal multi-symbol joint
measurement schemes must be used at the receiver. All capacities are plotted in units
of bits per channel use.

Ultimate capacity using the BPSK alphabet


The ultimate capacity that can be achieved using a binary coherent-state alphabet
$\{|\alpha\rangle, |-\alpha\rangle\}$, with an average input-photon-number constraint $N$, can be computed
by maximizing the Holevo information for the binary alphabet over all binary prior
probability densities $\{p, 1 - p\}$. The ultimate capacity using the binary coherent-state
alphabet is given by

$$C_{\text{BPSK}}^{\text{ult}}(\eta N) = H\!\left(\frac{1 + e^{-2\eta N}}{2}\right). \tag{2.31}$$

Figure 2-8 shows the gap between the ultimate BPSK capacity and the achievable
rate using a BPSK coherent-state alphabet and symbol-by-symbol Dolinar-receiver
measurement. In order to bridge the gap between these two capacities, optimal
multi-symbol joint measurement schemes must be used at the receiver. Some examples of
such improvement over single-symbol measurement schemes (and implementations
thereof) were worked out by Sasaki et al. in [45, 46]. Recently, Ishida et al. worked
out best achievable rate regions for the lossy bosonic channel using various
coherent-state modulation schemes [47], such as Quadrature Phase Shift Keying (QPSK) and
Quadrature Amplitude Modulation (QAM).
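The gap between the symbol-by-symbol rate (2.30) and the ultimate BPSK capacity (2.31) can be tabulated with a few lines of Python. This is a minimal sketch with hypothetical function names, assuming base-2 logarithms so that both quantities are in bits per channel use.

```python
import math

LN2 = math.log(2.0)


def h2(p: float) -> float:
    """Binary Shannon entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log1p(-p)) / LN2


def c_bpsk_symbolwise(eta_n: float) -> float:
    """Eq. (2.30): BSC capacity with the minimum-error symbol-by-symbol measurement."""
    p_err = 0.5 * (1.0 - math.sqrt(-math.expm1(-4.0 * eta_n)))
    return 1.0 - h2(p_err)


def c_bpsk_ultimate(eta_n: float) -> float:
    """Eq. (2.31): Holevo information of the equiprobable BPSK alphabet."""
    return h2(0.5 * (1.0 + math.exp(-2.0 * eta_n)))


if __name__ == "__main__":
    for eta_n in (1e-3, 1e-2, 1e-1, 1.0):
        print(eta_n, c_bpsk_symbolwise(eta_n), c_bpsk_ultimate(eta_n))
```

Both rates saturate at 1 bit per channel use for large $\eta N$, and the gap between them is what joint measurements over multiple symbols would have to recover.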
Chapter 3


Broadcast  and  Wiretap  Channels 

3.1  Background 

A broadcast channel is the congregation of communication media connecting a single
transmitter to two or more receivers. The transmitter encodes and sends out
information to each receiver in a way that each receiver can reliably decode its
respective information. The information sent out to the receivers may be independent
or nested. The capacity region of a broadcast channel is the set of all rate $M$-tuples
$(R_0, \ldots, R_{M-1})$ at which independent information can be sent perfectly reliably to
the respective $M$ receivers by using suitable encoding and decoding schemes. The
classical discrete-memoryless broadcast channel, whose capacity region remains an
open problem, was first studied by Cover [48]. The capacity region of a special case
of the broadcast channel, known as the degraded broadcast channel, in which the
channel symbols received by one of the receivers are a stochastically degraded version of
the symbols received by the other receiver, was conjectured by Cover [48], and later
proved to be achievable by Bergmans [49]. The converse to the degraded broadcast
channel capacity theorem was established later by Bergmans [50] and Gallager [51].

A quantum broadcast channel is a quantum-mechanical communication link connecting
one transmitter to two or more receivers. Quantum broadcast channels, like
point-to-point quantum communication channels, may be used to send classical
information, quantum information, or a combination thereof. We will restrict our attention
only to the case of classical information transmission over quantum broadcast channels.
The transmitter encodes information intended to be sent to various receivers
into quantum states of the transmission medium, and the receivers extract classical
information from received quantum states by performing suitable quantum measurements.
Even though the capacity region of the general quantum broadcast channel is
still an open problem, like its classical counterpart, the capacity region of the two-user
degraded quantum broadcast channel for finite-dimensional Hilbert spaces was found
by Yard et al. [52]. Bosonic broadcast channels constitute a special class of quantum
broadcast channels in which the information is encoded into quantum states of an
optical-frequency quantized electromagnetic field.


In this chapter, we will show that when coherent-state encoding is employed in
conjunction with coherent detection, the bosonic broadcast channel is equivalent to
a classical degraded Gaussian broadcast channel whose capacity region is known,
and known to be dual to that of the classical Gaussian multiple-access channel [53].
Thus, under these coding and detection assumptions, the capacity region for the
bosonic broadcast channel is dual to that for the bosonic multiple-access channel
(MAC) with coherent-state encoding and coherent detection. To treat more general
transmitter and receiver conditions, we use a limiting argument to apply the degraded
quantum broadcast-channel coding theorem for finite-dimensional state spaces [52] to
the infinite-dimensional bosonic channel with an average photon-number constraint.
We first consider the lossless two-receiver case in which Alice (A) simultaneously
transmits to Bob (B), via the transmissivity $\eta > 1/2$ port of a lossless beam splitter,
and to Charlie (C), via that beam splitter’s reflectivity $1 - \eta < 1/2$ port. Alice uses
arbitrary encoding with an average photon number $N$, while Bob and Charlie employ
optimum measurements. Assuming that a conjecture about the minimum output entropy of
a lossy bosonic channel is true (see Chapter 4), we show that the ultimate capacity
region is achieved by a coherent-state encoding, and is given by

$$R_B \le g(\eta\beta N), \qquad R_C \le g((1 - \eta)N) - g((1 - \eta)\beta N), \tag{3.1}$$

where $g(x) = (x + 1)\log(x + 1) - x\log(x)$ is the entropy of the Bose-Einstein
distribution with mean $x$, and $\beta \in [0, 1]$. Interestingly, this capacity region is not dual
to that of the bosonic multiple-access channel with coherent-state encoding and
optimum measurement that was found in [11].
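The boundary of the region in (3.1) can be traced by sweeping the power-sharing parameter $\beta$ from 0 to 1. The sketch below is illustrative only: the function names are hypothetical, and base-2 logarithms are assumed so that rates are in bits per channel use.

```python
import math

LN2 = math.log(2.0)


def g(x: float) -> float:
    """Bose-Einstein entropy g(x) = (x+1)log2(x+1) - x log2(x), in bits."""
    if x <= 0.0:
        return 0.0
    return ((1.0 + x) * math.log1p(x) - x * math.log(x)) / LN2


def rate_pair(eta: float, n_bar: float, beta: float):
    """Boundary point (R_B, R_C) of Eq. (3.1) for power-sharing parameter beta."""
    r_b = g(eta * beta * n_bar)
    r_c = g((1.0 - eta) * n_bar) - g((1.0 - eta) * beta * n_bar)
    return r_b, r_c


if __name__ == "__main__":
    eta, n_bar = 0.8, 5.0   # example values chosen only for illustration
    for k in range(11):
        beta = k / 10.0
        print(beta, rate_pair(eta, n_bar, beta))
```

At $\beta = 1$ all of the rate goes to Bob and at $\beta = 0$ all of it goes to Charlie; intermediate values trade one rate against the other.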

We begin this chapter by reviewing the capacity region of the degraded classical
broadcast channel, and we evaluate the capacity region of the Gaussian broadcast
channel as an example. We then present a brief review of the capacity theorem of
Yard et al. for the degraded quantum broadcast channel with two receivers, following
which we present our generalization of their result for an arbitrary number of
receivers. Thereafter we present our results on the classical information capacity of
the bosonic broadcast channel. We first analyze the two-receiver lossless case with
no additional noise and that with additive thermal noise. We then generalize our
results to the lossy broadcast channel with multiple receivers. We compare the rate
regions obtained by using coherent-state encoding for the bosonic broadcast channel
with that of the bosonic multiple-access channel, and we find that the duality
observed between the capacity regions of the classical Gaussian-noise broadcast and
multiple-access channels is not seen in the quantum case. The chapter concludes
with a section on the privacy capacity of the bosonic wiretap channel, which is a
special kind of two-receiver broadcast channel in which one of the receivers is an
eavesdropper, while the other is the intended receiver.


3.2  Classical  Broadcast  Channel 

In classical information theory, a two-user discrete-memoryless broadcast channel is
modeled by a classical probability transition matrix $p_{B,C|A}(\beta, \gamma|\alpha)$, where $\alpha$, $\beta$, and
$\gamma$ belong to Alice’s (input) alphabet $\mathcal{A}$, and Bob and Charlie’s (output) alphabets, $\mathcal{B}$
and $\mathcal{C}$, respectively. A broadcast channel is said to be memoryless if successive uses
of the channel are independent, i.e., $p_{B^n,C^n|A^n}(\beta^n, \gamma^n|\alpha^n) = \prod_{i=1}^{n} p_{B,C|A}(\beta_i, \gamma_i|\alpha_i)$.
$M$-user broadcast channels, for $M > 2$, are defined similarly. A $((2^{nR_B}, 2^{nR_C}), n)$ code
for a two-receiver broadcast channel consists of an encoder

$$\alpha^n : 2^{nR_B} \times 2^{nR_C} \to \mathcal{A}^n, \tag{3.2}$$

and two decoders

$$\hat{W}_B : \mathcal{B}^n \to 2^{nR_B}, \tag{3.3}$$

$$\hat{W}_C : \mathcal{C}^n \to 2^{nR_C}. \tag{3.4}$$

The probability of error $P_e^{(n)}$ is the probability that the overall decoded message
does not match the transmitted message, i.e.,

$$P_e^{(n)} = P\big(\hat{W}_B(B^n) \ne W_B \ \text{or}\ \hat{W}_C(C^n) \ne W_C\big),$$

where the message $(W_B, W_C)$ is assumed to be uniformly distributed over $2^{nR_B} \times 2^{nR_C}$.
A rate pair $(R_B, R_C)$ is said to be achievable for the broadcast channel if there exists
a sequence of $((2^{nR_B}, 2^{nR_C}), n)$ codes with $P_e^{(n)} \to 0$ as $n \to \infty$. The capacity region
of the broadcast channel is the closure of the set of all achievable rates.

Although the capacity region for general broadcast channels is still an open problem,
the capacity region is known for a special class of broadcast channels known
as degraded broadcast channels. It is often the case that one receiver (say C) is
further downstream from the first receiver (say B), so that C always receives a
degraded version of B’s message. When $A \to B \to C$ forms a Markov chain, i.e.,
when $p_{B,C|A}(\beta, \gamma|\alpha) = p_{B|A}(\beta|\alpha)\, p_{C|B}(\gamma|\beta)$, we say that the receiver C is a physically
degraded version of B, and that $A \to B \to C$ is a physically degraded broadcast channel.
The probabilities of error $P(\hat{W}_B(B^n) \ne W_B)$ and $P(\hat{W}_C(C^n) \ne W_C)$ depend only
on the marginal distributions $p_{B|A}(\beta|\alpha)$ and $p_{C|A}(\gamma|\alpha)$ and not on the joint distribution
$p_{B,C|A}(\beta, \gamma|\alpha)$. Thus we define a weaker notion of degraded broadcast channel:
a broadcast channel $p_{B,C|A}(\beta, \gamma|\alpha)$ is said to be degraded (also known as stochastically
degraded, to distinguish it from the stronger notion of degraded in the Markov sense)
if there exists a distribution $p(\gamma|\beta)$ such that

$$p_{C|A}(\gamma|\alpha) = \sum_{\beta} p_{B|A}(\beta|\alpha)\, p(\gamma|\beta). \tag{3.5}$$
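The stochastic-degradation condition (3.5) is a matrix product when channels are written as row-stochastic matrices. The following minimal sketch, with hypothetical helper names, builds Charlie's marginal channel by cascading Bob's channel with a degrading channel, using two binary symmetric channels as a toy example.

```python
def compose(p_b_given_a, p_c_given_b):
    """Eq. (3.5): p_{C|A}(c|a) = sum_b p_{B|A}(b|a) * p(c|b).

    Each channel is a list of rows; row a holds the output distribution for input a.
    """
    n_c = len(p_c_given_b[0])
    return [
        [sum(row[b] * p_c_given_b[b][c] for b in range(len(row))) for c in range(n_c)]
        for row in p_b_given_a
    ]


def bsc(eps: float):
    """Binary symmetric channel with crossover probability eps."""
    return [[1.0 - eps, eps], [eps, 1.0 - eps]]


if __name__ == "__main__":
    # Bob sees a BSC(0.1); Charlie's channel is Bob's followed by a BSC(0.2).
    print(compose(bsc(0.1), bsc(0.2)))
```

In this toy case Charlie's marginal channel is again binary symmetric, with crossover probability $0.1 \cdot 0.8 + 0.9 \cdot 0.2 = 0.26$, so it is stochastically degraded with respect to Bob's.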

Such channels were first studied by Cover [48], who conjectured that the capacity
region for Alice to send independent information to Bob and Charlie at rates $R_B$ and
$R_C$ respectively over a degraded broadcast channel¹ $A \to B \to C$ is the convex hull
of the closure of all $(R_B, R_C)$ satisfying

$$R_B \le I(A; B|T), \tag{3.6}$$

$$R_C \le I(T; C), \tag{3.7}$$

for some joint distribution $p_T(t)\, p_{A|T}(\alpha|t)\, p_{B,C|A}(\beta, \gamma|\alpha)$, where $T$ is an auxiliary
random variable with cardinality $|\mathcal{T}| \le \min\{|\mathcal{A}|, |\mathcal{B}|, |\mathcal{C}|\}$. The achievability of the
above capacity result was proved by Bergmans [49], whereas Gallager came up with
a particularly novel proof of the converse [51].


3.2.1  Degraded  broadcast  channel  with  M  receivers 

A  formal  proof  of  the  capacity  region  for  a  degraded  discrete  memoryless  broadcast 
channel  with  an  arbitrary  number  of  receivers,  was  done  recently  by  Borade  et.  al. 
[54],  in  which  they  also  proved  bounds  for  capacity  regions  for  general  multiple- level 
broadcast  networks.  Consider  a  discrete  memoryless  broadcast  channel  with  transmit¬ 
ter  Alice  (A)  sending  information  to  M  receivers,  Y0,  bj ,  . . .,  YM_y.  Such  a  channel  is 
completely  specified  by  the  transition  probabilities  py0,...,yM_1|J4(?/o,  •  •  •  ,UM-i\oi).  Let 
us  also  assume  that  the  channel  map  is  stochastically  degraded  (in  the  same  sense  as 
described  in  Eq.  (3.5)),  as  A  — >  Y'0  — >  Y±  — >  . . .  — >  YM_ i;  i.e.,  Y0  being  the  least  noisy 
receiver  and  YM_  j  the  noisiest  receiver.  The  optimal  capacity  region  is  given  by  the 

1In all that follows, a degraded broadcast channel $A \to B \to C$ will be understood to mean a stochastically degraded channel (3.5) with transmitter $A$, and receivers $B$ and $C$.

convex hull of all rate $M$-tuples $(R_0, R_1, \ldots, R_{M-1})$ satisfying

$$R_0 \le I(A; Y_0 | T_1),$$
$$R_k \le I(T_k; Y_k | T_{k+1}), \quad \text{for } k \in \{1, \ldots, M-2\},$$
$$R_{M-1} \le I(T_{M-1}; Y_{M-1}), \qquad (3.8)$$

where $T_k$, $k \in \{1, \ldots, M-1\}$, are auxiliary random variables such that $T_{M-1} \to T_{M-2} \to \ldots \to T_1 \to A$ forms a Markov chain, i.e.,

$$p_{T_{M-1},\ldots,T_1,A}(t_{M-1},\ldots,t_1,\alpha) = p_{T_{M-1}}(t_{M-1}) \left( \prod_{k=2}^{M-1} p_{T_{k-1}|T_k}(t_{k-1}|t_k) \right) p_{A|T_1}(\alpha|t_1). \qquad (3.9)$$

The above Markov chain structure of the auxiliary random variables $T_k$, $k \in \{1, \ldots, M-1\}$, has been shown to be optimal [54]. In a degraded broadcast channel, messages intended for noisier receivers can always be decoded by less noisy receivers$^2$. Hence the $k$th receiver actually receives $M - k$ messages at a rate $R_k + \ldots + R_{M-1}$.


3.2.2  The  Gaussian  broadcast  channel 

A  Gaussian  broadcast  channel  is  one  in  which  each  receiver  receives  the  transmitted 
symbols  corrupted  by  zero-mean  additive  Gaussian  noise  of  a  fixed  noise  variance.  The 
Gaussian  broadcast  channel  is  an  example  of  a  degraded  broadcast  channel  because 
the  channel  can  be  recharacterized  as  a  stochastically  degraded  channel  in  which  the 
noisier  receiver’s  received  symbols  can  be  thought  of  as  being  obtained  from  the  less 
noisy  receiver’s  received  symbols  by  passing  them  through  a  hypothetical  additive 
Gaussian  noise  channel  with  a  noise  variance  equaling  the  difference  of  the  Gaussian 
noise  variances  seen  by  the  two  receivers  (see  Fig.  3-1). 


2For a more detailed description of how messages are encoded and decoded in a degraded broadcast channel using superposition coding, please see [3].

Figure 3-1: Classical additive Gaussian noise broadcast channel


The  two-user  Gaussian  broadcast  channel 

The simplest case of the Gaussian broadcast channel is the scalar two-receiver case. There are two receivers, Bob and Charlie, whose received symbols $Y_B$ and $Y_C$ are given in terms of Alice's transmitted symbol $X_A$ by

$$Y_B = X_A + Z_B \quad \text{and} \qquad (3.10)$$
$$Y_C = X_A + Z_C, \qquad (3.11)$$

where $Z_B \sim \mathcal{N}(0, N_B)$ and $Z_C \sim \mathcal{N}(0, N_C)$ are zero-mean Gaussian distributed random variables with variances $N_B$ and $N_C$ respectively. This channel can be characterized by an equivalent degraded channel as shown in Fig. 3-1.

Let us use $C_G(\gamma)$ to denote the capacity of a memoryless scalar additive white Gaussian noise (AWGN) channel with signal-to-noise ratio (SNR) $\gamma$. It is well known that, in nats per channel use,

$$C_G(\gamma) = \frac{1}{2} \ln(1 + \gamma). \qquad (3.12)$$

It is easily shown [3] that an achievable rate region for the Gaussian broadcast channel, with signal power constraint $E[|X_A|^2] \le N$, can be obtained by choosing both $p_T(t)$ and $p_{A|T}(\alpha|t)$ to be Gaussian. The resulting achievable region is given by

$$R_B \le C_G\!\left( \frac{\beta N}{N_B} \right), \qquad (3.13)$$
$$R_C \le C_G\!\left( \frac{(1-\beta) N}{\beta N + N_C} \right), \qquad (3.14)$$

for $0 \le \beta \le 1$. Bergmans proved the converse statement for the Gaussian broadcast channel [50], thereby showing that the region given above is the ultimate capacity region for the Gaussian broadcast channel. Using Bergmans's notation$^3$,

$$g_C(S) \equiv \frac{1}{2} \ln(2 \pi e S) \qquad (3.15)$$

to denote the Shannon (differential) entropy, in nats, of a Gaussian random variable with variance $S$, the above two-receiver Gaussian broadcast capacity region can alternatively be expressed as

$$R_B \le g_C(\beta N + N_B) - g_C(N_B), \qquad (3.16)$$
$$R_C \le g_C(N + N_C) - g_C(\beta N + N_C), \qquad (3.17)$$

for $0 \le \beta \le 1$. An example plot of the capacity region of a two-user Gaussian broadcast channel is given in Fig. 3-2.
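The region in Eqs. (3.16) and (3.17) is easy to trace numerically. The following is a minimal sketch, not the thesis's own code: the function names (`g_c`, `awgn_capacity`, `gaussian_bc_boundary`) are illustrative, and the AWGN helper assumes the standard capacity formula $\tfrac{1}{2}\ln(1+\gamma)$ in nats. The loop uses the parameters quoted in the caption of Fig. 3-2 (power 10, noise variances 2 and 6).

```python
import math


def g_c(variance):
    """Differential entropy, in nats, of a Gaussian with the given variance (Eq. 3.15)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def awgn_capacity(snr):
    """Capacity, in nats per use, of a scalar AWGN channel (assumed standard formula)."""
    return 0.5 * math.log(1.0 + snr)


def gaussian_bc_boundary(power, noise_b, noise_c, beta):
    """Boundary point (R_B, R_C) of Eqs. (3.16)-(3.17) for a power split beta in [0, 1]."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    r_b = g_c(beta * power + noise_b) - g_c(noise_b)
    r_c = g_c(power + noise_c) - g_c(beta * power + noise_c)
    return r_b, r_c


# Sweep beta to trace the boundary for the Fig. 3-2 parameters.
for beta in (0.0, 0.25, 0.5, 0.75, 1.0):
    r_b, r_c = gaussian_bc_boundary(10.0, 2.0, 6.0, beta)
    print(f"beta={beta:.2f}  R_B={r_b:.4f}  R_C={r_c:.4f}")
```

Because $g_C(x) - g_C(y) = \tfrac{1}{2}\ln(x/y)$, each boundary point can also be written as an AWGN capacity: Bob sees SNR $\beta N / N_B$, and Charlie sees SNR $(1-\beta)N / (\beta N + N_C)$, with Bob's share of the power acting as extra noise.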


An  example  from  optical  communications 


Let us consider a special case of the two-user Gaussian broadcast channel, in which Bob and Charlie receive attenuated versions of Alice's message corrupted by Gaussian noise, i.e.,

$$Y_B = \sqrt{\eta}\, X_A + \sqrt{1-\eta}\, Z_B \quad \text{and}$$
$$Y_C = \sqrt{1-\eta}\, X_A + \sqrt{\eta}\, Z_C, \qquad (3.18)$$


3We use a subscript ($C$) for Bergmans's $g(\cdot)$ function to distinguish it from the function $g(x) \equiv (1+x)\ln(1+x) - x\ln x$, which is the Shannon entropy of the Bose-Einstein probability mass function with mean $x$ (and also the von Neumann entropy of the bosonic thermal state with mean photon number $x$), and which will be used throughout this thesis. We will see later in this chapter that the functions $g_C(\cdot)$ and $g(\cdot)$ play analogous roles in defining the capacity regions of the classical Gaussian broadcast channel and of the quantum (bosonic) broadcast channel, respectively. As we will see in Chapter 5, the functions $g_C(\cdot)$ and $g(\cdot)$ also play analogous roles in defining the (classical) Entropy Power Inequality (EPI) and the (quantum) Entropy Photon-Number Inequality (EPnI).

Figure 3-2: Capacity region of the classical additive Gaussian noise broadcast channel, with an input power constraint $E[|X_A|^2] \le 10$, and noise powers given by $N_B = 2$ and $N_C = 6$. The rates $R_B$ and $R_C$ are in nats per channel use.


where $1/2 < \eta < 1$, and $Z_B$ and $Z_C$ are independent, identically distributed (i.i.d.) $\mathcal{N}(0, N)$ random variables. Such a channel model arises when the transmitter Alice encodes classical information into the magnitude of the complex electromagnetic field of a classical laser beam and the beam splits into two through a lossless beam splitter of transmissivity $\eta$, in the presence of an ambient thermal environment that is sufficiently strong that its noise contribution dominates over the quantum noise. Bob and Charlie, the two receivers, receive their respective classical signals at the two output ports of the beam splitter by performing optical homodyne detection (see Fig. 3-3). Using Bergmans's results, it is not hard to see that the capacity region of this channel will be given by

$$R_B \le g_C(\eta \beta \bar{N} + (1-\eta) N) - g_C((1-\eta) N), \qquad (3.19)$$
$$R_C \le g_C((1-\eta) \bar{N} + \eta N) - g_C((1-\eta) \beta \bar{N} + \eta N), \qquad (3.20)$$

where $0 \le \beta \le 1$, and where $\bar{N}$ denotes the input power constraint, $E[|X_A|^2] \le \bar{N}$, as distinct from the noise variance $N$.
[Figure 3-3 labels: thermal state with mean photon number per mode $N_T$ at the environment port; homodyne detection at both outputs; output variances $(2(1-\eta)N_T + 1)/4$ for $Y_B$ and $(2\eta N_T + 1)/4$ for $Y_C$, with the added constant marked as the quantum noise contribution.]

Figure 3-3: A broadcast channel in which the transmitter Alice encodes information into a real-valued $\alpha$ for a classical electromagnetic field (coherent state $|\alpha\rangle$) and the beam splits into two, through a lossless beam splitter with transmissivity $\eta$, in the presence of an ambient thermal environment with an average of $N_T$ photons per mode. Bob and Charlie, the two receivers, receive their respective classical signals $Y_B$ and $Y_C$ at the two output ports of the beam splitter by performing optical homodyne detection. In the limit of high noise ($N_T \gg 1$), and with the substitutions $X_A = \alpha$, $\alpha \in \mathbb{R}$, and $N_T = 2N$, this channel reduces to the broadcast channel model described by Eq. (3.18).
The M-receiver Gaussian broadcast channel

As an example of the capacity region of a degraded broadcast channel with $M$ receivers, let us consider an $M$-receiver version of the lossy thermal noise optical channel model from Eq. (3.18). Each of the $M$ receivers receives an attenuated version of Alice's transmitted message with additive zero-mean Gaussian noise, given by

$$Y_k = \sqrt{\eta_k}\, A + \sqrt{1-\eta_k}\, Z_k, \quad k \in \{0, \ldots, M-1\}, \qquad (3.21)$$

where the transmitter has a mean power constraint given by $E[|A|^2] \le \bar{N}$, and $Z_k$ are i.i.d. Gaussian $\mathcal{N}(0, N)$ random variables. The optimal capacity region of the Gaussian broadcast channel for $M$ receivers was first found by Bergmans [50], and is given by

$$R_k \le g_C(\eta_k \beta_{k+1} \bar{N} + (1-\eta_k) N) - g_C(\eta_k \beta_k \bar{N} + (1-\eta_k) N), \quad k \in \{0, \ldots, M-1\}, \qquad (3.22)$$

where

$$0 = \beta_0 \le \beta_1 \le \ldots \le \beta_{M-1} \le \beta_M = 1. \qquad (3.23)$$
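As a hedged illustration of Eqs. (3.22) and (3.23), the sketch below evaluates the rate tuple for a given set of transmissivities and power-sharing parameters. The function and argument names are hypothetical. The signal power and the noise variance are passed as two separate arguments, `signal_power` and `noise_variance`, which is an assumption of this sketch about how the two quantities in Eq. (3.22) should be read. Transmissivities are assumed to be strictly less than one, so that every receiver sees some noise.

```python
import math


def g_c(variance):
    """Differential entropy, in nats, of a Gaussian with the given variance (Eq. 3.15)."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def m_receiver_rates(etas, betas, signal_power, noise_variance):
    """Rate tuple (R_0, ..., R_{M-1}) from Eq. (3.22).

    etas  : transmissivities eta_0, ..., eta_{M-1} (each assumed < 1)
    betas : power-sharing parameters beta_0, ..., beta_M with beta_0 = 0, beta_M = 1
    """
    if len(betas) != len(etas) + 1:
        raise ValueError("need M + 1 beta values for M receivers")
    if betas[0] != 0.0 or betas[-1] != 1.0:
        raise ValueError("beta_0 must be 0 and beta_M must be 1")
    if any(b_next < b for b, b_next in zip(betas, betas[1:])):
        raise ValueError("betas must be non-decreasing, as in Eq. (3.23)")
    rates = []
    for k, eta in enumerate(etas):
        noise = (1.0 - eta) * noise_variance
        upper = g_c(eta * betas[k + 1] * signal_power + noise)
        lower = g_c(eta * betas[k] * signal_power + noise)
        rates.append(upper - lower)
    return rates


# Three receivers, ordered from least noisy to noisiest.
print(m_receiver_rates([0.9, 0.5, 0.2], [0.0, 0.2, 0.6, 1.0], 10.0, 2.0))
```

With two receivers, transmissivities $(\eta, 1-\eta)$ and $\beta_1 = \beta$, the same function reproduces the two-user expressions (3.19) and (3.20).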

3.3  Quantum  Broadcast  Channel 

In  this  section,  we  study  the  classical  information  capacity  of  quantum  broadcast 
channels,  which  are  quantum  channels  from  one  transmitter  to  two  or  more  receivers. 
The  transmitter  encodes  information  intended  to  be  sent  to  various  receivers  into  the 
quantum  states  of  the  transmission  medium,  and  the  receivers  extract  classical  infor¬ 
mation  from  received  quantum  states  by  performing  suitable  quantum  measurements. 
Even though the capacity region of the general quantum broadcast channel is still an open problem, like its classical counterpart, the capacity region of the two-user degraded quantum broadcast channel for finite-dimensional Hilbert spaces was found by Yard et al. [52]. We begin this section by stating Yard et al.'s capacity theorem, and then we prove its straightforward extension to the case of an arbitrary number of receivers.


3.3.1  Quantum  degraded  broadcast  channel  with  two  receivers 

A quantum channel $\mathcal{N}_{A-B}$ from Alice to Bob is a trace-preserving completely positive map that maps Alice's single-use density operators $\hat\rho^A$ to Bob's, $\hat\rho^B = \mathcal{N}_{A-B}(\hat\rho^A)$. The two-user quantum broadcast channel $\mathcal{N}_{A-BC}$ is a quantum channel from sender Alice ($A$) to two independent receivers Bob ($B$) and Charlie ($C$). The quantum channel from Alice to Bob is obtained by tracing out $C$ from the channel map, i.e., $\mathcal{N}_{A-B} \equiv \mathrm{Tr}_C(\mathcal{N}_{A-BC})$, with a similar definition for $\mathcal{N}_{A-C}$. We say that a broadcast channel $\mathcal{N}_{A-BC}$ is degraded if there exists a degrading channel $\mathcal{N}^{\rm deg}_{B-C}$ from $B$ to $C$ satisfying $\mathcal{N}_{A-C} = \mathcal{N}^{\rm deg}_{B-C} \circ \mathcal{N}_{A-B}$. The degraded broadcast channel describes a physical scenario in which, for each successive $n$ uses of $\mathcal{N}_{A-BC}$, Alice communicates a randomly generated classical message $(m, k) \in (W_B, W_C)$ to Bob and Charlie, where the message-sets $W_B$ and $W_C$ are sets of classical indices of sizes $2^{nR_B}$ and $2^{nR_C}$ respectively. The messages $(m, k)$ are assumed to be uniformly distributed over $(W_B, W_C)$. Because of the degraded nature of the channel, Bob receives the entire message $(m, k)$ whereas Charlie only receives the index $k$. To convey these messages $(m, k)$, Alice prepares $n$-channel-use states that, after transmission through the channel, result in bipartite conditional density matrices $\{\hat\rho^{B^nC^n}_{m,k}\}$, $\forall (m, k) \in (W_B, W_C)$. The quantum states received by Bob and Charlie, $\{\hat\rho^{B^n}_{m,k}\}$ and $\{\hat\rho^{C^n}_{m,k}\}$ respectively, can be found by tracing out the other receiver, viz., $\hat\rho^{B^n}_{m,k} \equiv \mathrm{Tr}_{C^n}(\hat\rho^{B^nC^n}_{m,k})$, etc. A $(2^{nR_B}, 2^{nR_C}, n, \epsilon)$ code for this channel consists of an encoder

$$x^n : (W_B, W_C) \to \mathcal{A}^n, \qquad (3.24)$$

a positive operator-valued measure (POVM) $\{\Lambda_{mk}\}$ on $\mathcal{B}^n$ and a POVM $\{\Lambda'_k\}$ on $\mathcal{C}^n$ which satisfy$^4$

$$\mathrm{Tr}\left( \hat\rho^{B^nC^n}_{m,k} \left( \Lambda_{mk} \otimes \Lambda'_k \right) \right) \ge 1 - \epsilon \qquad (3.25)$$

4$\mathcal{A}^n$, $\mathcal{B}^n$, and $\mathcal{C}^n$ are the $n$-channel-use alphabets of Alice, Bob, and Charlie, with respective sizes $|\mathcal{A}^n|$, $|\mathcal{B}^n|$, and $|\mathcal{C}^n|$.
Figure 3-4: Schematic diagram of the degraded single-mode bosonic broadcast channel. The transmitter Alice ($A$) encodes her messages to Bob ($B$) and Charlie ($C$) in a classical index $j$, and, over $n$ successive uses of the channel, creates a bipartite state $\hat\rho^{B^nC^n}_j$ at the receivers.


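for every $(m, k) \in (W_B, W_C)$. A rate-pair $(R_B, R_C)$ is achievable if there exists a sequence of $(2^{nR_B}, 2^{nR_C}, n, \epsilon_n)$ codes with $\epsilon_n \to 0$. The classical capacity region of the broadcast channel is defined as the convex hull of the closure of all achievable rate pairs $(R_B, R_C)$. The classical capacity region of the two-user degraded quantum broadcast channel $\mathcal{N}_{A-BC}$ was recently derived by Yard et al. [52], and can be expressed in terms of the Holevo information [27, 28, 29],

$$\chi(p_j, \hat\sigma_j) \equiv S\Big( \sum_j p_j \hat\sigma_j \Big) - \sum_j p_j S(\hat\sigma_j), \qquad (3.26)$$

where $\{p_j\}$ is a probability distribution associated with the density operators $\hat\sigma_j$, and $S(\hat\rho) \equiv -\mathrm{Tr}(\hat\rho \log \hat\rho)$ is the von Neumann entropy of the quantum state $\hat\rho$. Because $\chi$ may not be additive, the rate region $(R_B, R_C)$ of the degraded broadcast channel must be computed by maximizing over successive uses of the channel, i.e., for $n$ uses

$$R_B \le \sum_i p_i\, \chi\!\left( p_{j|i}, \mathcal{N}^{\otimes n}_{A-B}(\hat\rho_j^{A^n}) \right) / n = \frac{1}{n} \sum_i p_i \left[ S\Big( \sum_j p_{j|i}\, \hat\rho_j^{B^n} \Big) - \sum_j p_{j|i}\, S(\hat\rho_j^{B^n}) \right], \quad \text{and} \qquad (3.27)$$

$$R_C \le \chi\!\left( p_i, \sum_j p_{j|i}\, \mathcal{N}^{\otimes n}_{A-C}(\hat\rho_j^{A^n}) \right) / n = \frac{1}{n} \left[ S\Big( \sum_{i,j} p_i\, p_{j|i}\, \hat\rho_j^{C^n} \Big) - \sum_i p_i\, S\Big( \sum_j p_{j|i}\, \hat\rho_j^{C^n} \Big) \right], \qquad (3.28)$$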


where $j \equiv (m, k)$ is a collective index and the states $\{\hat\rho_j^{A^n}\}$ live in the Hilbert space $\mathcal{H}^{\otimes n}$ of $n$ successive uses of the broadcast channel$^5$. The probabilities $\{p_i\}$ form a distribution over an auxiliary classical alphabet $\mathcal{T}$, of size $|\mathcal{T}|$, satisfying $|\mathcal{T}| \le \min\{|\mathcal{A}|^n, |\mathcal{B}|^{2n} + |\mathcal{C}|^{2n} - 1\}$. The ultimate rate-region is computed by maximizing the region specified by Eqs. (3.27) and (3.28)$^6$ over $\{p_i\}$, $\{p_{j|i}\}$, $\{\hat\rho_j^{A^n}\}$, and $n$, subject to the cardinality constraint on $|\mathcal{T}|$. Fig. 3-4 illustrates the setup of the two-user degraded quantum channel.
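To make the Holevo information concrete, here is a minimal sketch restricted to qubit ensembles. It takes the Holevo information in its standard form, $\chi(p_j, \hat\sigma_j) = S(\sum_j p_j \hat\sigma_j) - \sum_j p_j S(\hat\sigma_j)$. Representing each state by its Bloch vector is an assumption made here to keep the example free of linear-algebra libraries; it is not how the thesis treats general states. A qubit with Bloch vector of length $r$ has eigenvalues $(1 \pm r)/2$. The function names are illustrative.

```python
import math


def qubit_entropy(bloch):
    """Von Neumann entropy, in nats, of a qubit state given by its Bloch vector."""
    r = min(1.0, math.sqrt(sum(c * c for c in bloch)))
    entropy = 0.0
    for lam in ((1.0 + r) / 2.0, (1.0 - r) / 2.0):
        if lam > 0.0:  # the term 0 * ln(0) is taken to be 0
            entropy -= lam * math.log(lam)
    return entropy


def holevo_information(probs, blochs):
    """chi = S(sum_j p_j sigma_j) - sum_j p_j S(sigma_j) for a qubit ensemble."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    # The Bloch vector of a mixture is the same mixture of the Bloch vectors.
    mean = [sum(p * b[i] for p, b in zip(probs, blochs)) for i in range(3)]
    average_entropy = sum(p * qubit_entropy(b) for p, b in zip(probs, blochs))
    return qubit_entropy(mean) - average_entropy


# Two orthogonal pure states, then two non-orthogonal pure states.
print(holevo_information([0.5, 0.5], [(0, 0, 1), (0, 0, -1)]))
print(holevo_information([0.5, 0.5], [(0, 0, 1), (1, 0, 0)]))
```

Two equiprobable orthogonal pure states give $\ln 2$ nats, the largest value a qubit ensemble can reach, while non-orthogonal pure states give strictly less.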


5Note that, as the actual $n$-channel-use quantum states sent out by Alice, $\hat\rho_j^{A^n}$, do not appear in the expressions for $R_B$ or $R_C$ in Eqs. (3.27) and (3.28), the quantum broadcast channel (set up to transmit classical information to multiple receivers) may be seen, without any ambiguity, as a cq-broadcast channel, in which Alice's $n$-use alphabet $A^n$ is a classical random variable that takes values on a classical index set $\{j\}$ over $n$ successive uses of the channel.

6An alternative notation used in the literature: An alternative notation, widely used in published literature on quantum information theory, employs $I(A;B)_\rho \equiv H(A)_\rho - H(A|B)_\rho$ to denote the Holevo information between (classical or quantum) systems $A$ and $B$ in a joint state $\rho$. The classical capacity region of the quantum degraded broadcast channel expressed in this notation closely resembles that of the classical degraded broadcast channel. Consider a degraded broadcast channel with $n$-use conditional density matrices $\{\hat\rho_j^{B^nC^n}\}$. The capacity region for Alice ($A$) to send information to Bob ($B$) and Charlie ($C$) at rates $R_B$ and $R_C$ respectively is the convex hull of the closure of all $(R_B, R_C)$ satisfying

$$R_B \le I(A^n; B^n | T)_\sigma / n \qquad (3.29)$$
$$R_C \le I(T; C^n)_\sigma / n \qquad (3.30)$$

for some $n \ge 1$ and some $p_{T,A^n}(i, j)$ giving rise to the state $\sigma^{TA^nB^nC^n} = \bigoplus_{i,j} p_T(i)\, p_{A^n|T}(j|i)\, \hat\rho_j^{B^nC^n}$.
3.3.2 Quantum degraded broadcast channel with M receivers

In this section, we generalize the capacity region of the two-receiver quantum degraded broadcast channel in the previous section to an arbitrary number of receivers. Using this result, later in this chapter, we evaluate the capacity region of the bosonic broadcast channel with an arbitrary number of receivers. The $M$-receiver quantum broadcast channel $\mathcal{N}_{A-Y_0 \ldots Y_{M-1}}$ is a quantum channel from a sender Alice ($A$) to $M$ independent receivers $Y_0, \ldots, Y_{M-1}$. The quantum channel from $A$ to $Y_0$ is obtained by tracing out all the other receivers from the channel map, i.e., $\mathcal{N}_{A-Y_0} \equiv \mathrm{Tr}_{Y_1,\ldots,Y_{M-1}}(\mathcal{N}_{A-Y_0 \ldots Y_{M-1}})$, with a similar definition for $\mathcal{N}_{A-Y_k}$ for $k \in \{1, \ldots, M-1\}$. We say that a broadcast channel $\mathcal{N}_{A-Y_0 \ldots Y_{M-1}}$ is degraded if there exists a series of degrading channels $\mathcal{N}^{\rm deg}_{Y_k-Y_{k+1}}$ from $Y_k$ to $Y_{k+1}$, for $k \in \{0, \ldots, M-2\}$, satisfying

$$\mathcal{N}_{A-Y_{M-1}} = \mathcal{N}^{\rm deg}_{Y_{M-2}-Y_{M-1}} \circ \mathcal{N}^{\rm deg}_{Y_{M-3}-Y_{M-2}} \circ \ldots \circ \mathcal{N}^{\rm deg}_{Y_0-Y_1} \circ \mathcal{N}_{A-Y_0}. \qquad (3.31)$$


The $M$-receiver degraded broadcast channel (see Fig. 3-5) describes a physical scenario in which, for each successive $n$ uses of the channel $\mathcal{N}_{A-Y_0 \ldots Y_{M-1}}$, Alice communicates a randomly generated classical message $(m_0, \ldots, m_{M-1}) \in (W_0, \ldots, W_{M-1})$ to the receivers $Y_0, \ldots, Y_{M-1}$, where the message-sets $W_k$ are sets of classical indices of sizes $2^{nR_k}$, for $k \in \{0, \ldots, M-1\}$. The messages $(m_0, \ldots, m_{M-1})$ are assumed to be independent and uniformly distributed over $(W_0, \ldots, W_{M-1})$, i.e.,

$$p_{W_0,\ldots,W_{M-1}}(m_0, \ldots, m_{M-1}) = \prod_{k=0}^{M-1} p_{W_k}(m_k) = \prod_{k=0}^{M-1} \frac{1}{2^{nR_k}}. \qquad (3.32)$$

Because of the degraded nature of the channel, given that the transmission rates are within the capacity region and proper encoding and decoding is employed at the transmitter and at the receivers, $Y_0$ can decode the entire message $M$-tuple $(m_0, \ldots, m_{M-1})$, $Y_1$ can decode the reduced message $(M-1)$-tuple $(m_1, \ldots, m_{M-1})$, and so on, until the noisiest receiver $Y_{M-1}$ can only decode the single message-
[Figure 3-5 labels: auxiliary complex-valued random variables; transmitter; degraded receivers; received state.]

Figure 3-5: This figure summarizes the setup of the transmitter and the channel model for the $M$-receiver quantum degraded broadcast channel. In each successive $n$ uses of the channel, the transmitter $A$ sends a randomly generated classical message $(m_0, \ldots, m_{M-1}) \in (W_0, \ldots, W_{M-1})$ to the $M$ receivers $Y_0, \ldots, Y_{M-1}$, where the message-sets $W_k$ are sets of classical indices of sizes $2^{nR_k}$, for $k \in \{0, \ldots, M-1\}$. The dashed arrows indicate the direction of degradation, i.e., $Y_0$ is the least noisy receiver, and $Y_{M-1}$ is the noisiest receiver. In this degraded channel model, the quantum state received at the receiver $Y_k$, $\hat\rho^{Y_k}$, can always be reconstructed from the quantum state received at the receiver $Y_{k'}$, $\hat\rho^{Y_{k'}}$, for $k' < k$, by passing $\hat\rho^{Y_{k'}}$ through a trace-preserving completely positive map (a quantum channel). For sending the classical message $(m_0, \ldots, m_{M-1}) \equiv j$, Alice chooses an $n$-use state (codeword) $\hat\rho_j^{A^n}$ using a prior distribution $p_{j|i_1}$, where $i_k$ denotes the complex values taken by an auxiliary random variable $T_k$. It can be shown that, in order to compute the capacity region of the quantum degraded broadcast channel, we need to choose $M-1$ complex-valued auxiliary random variables with a Markov structure as shown above, i.e., $T_{M-1} \to T_{M-2} \to \ldots \to T_k \to \ldots \to T_1 \to A^n$ is a Markov chain.
[Figure 3-6 labels: degraded receivers $Y_0, \ldots, Y_k, \ldots, Y_{M-1}$; quantum measurement (POVM elements); decoded messages (estimates).]

Figure 3-6: This figure illustrates the decoding end of the $M$-receiver quantum degraded broadcast channel. The decoder consists of a set of measurement operators, described by positive operator-valued measures (POVMs) for each receiver; $\{\Lambda^0_{m_0 \ldots m_{M-1}}\}, \{\Lambda^1_{m_1 \ldots m_{M-1}}\}, \ldots, \{\Lambda^{M-1}_{m_{M-1}}\}$ on $\mathcal{Y}_0^n, \mathcal{Y}_1^n, \ldots, \mathcal{Y}_{M-1}^n$ respectively. Because of the degraded nature of the channel, if the transmission rates are within the capacity region and proper encoding and decoding are employed at the transmitter and at the receivers respectively, $Y_0$ can decode the entire message $M$-tuple to obtain estimates $(\hat m_0^0, \ldots, \hat m_{M-1}^0)$, $Y_1$ can decode the reduced message $(M-1)$-tuple to obtain its own estimates $(\hat m_1^1, \ldots, \hat m_{M-1}^1)$, and so on, until the noisiest receiver $Y_{M-1}$ can only decode the single message-index $m_{M-1}$ to obtain an estimate $\hat m_{M-1}^{M-1}$. Even though the less noisy receivers can decode the messages of the noisier receivers, the message $m_k$ is intended to be sent to receiver $Y_k$, $\forall k$. Hence, when we say that a broadcast channel is operating at a rate $(R_0, \ldots, R_{M-1})$, we mean that the message $m_k$ is reliably decoded by receiver $Y_k$ at the rate $R_k$ bits per channel use.
index $m_{M-1}$. To convey the message-set$^7$ $m_0^{M-1}$, Alice prepares $n$-channel-use states that, after transmission through the channel, result in $M$-partite conditional density matrices $\{\hat\rho^{Y_0^n \ldots Y_{M-1}^n}_{m_0^{M-1}}\}$, $\forall m_0^{M-1} \in W_0^{M-1}$. The quantum states received by a particular receiver, say $Y_0$, can be found by tracing out the other receivers, viz. $\hat\rho^{Y_0^n}_{m_0^{M-1}} \equiv \mathrm{Tr}_{Y_1^n,\ldots,Y_{M-1}^n}\big( \hat\rho^{Y_0^n \ldots Y_{M-1}^n}_{m_0^{M-1}} \big)$, etc. Fig. 3-6 illustrates this decoding process. A $(2^{nR_0}, \ldots, 2^{nR_{M-1}}, n, \epsilon)$ code for this channel consists of an encoder

$$x^n : (W_0^{M-1}) \to \mathcal{A}^n, \qquad (3.33)$$

a set of positive operator-valued measures (POVMs) $\{\Lambda^0_{m_0 \ldots m_{M-1}}\}, \{\Lambda^1_{m_1 \ldots m_{M-1}}\}, \ldots, \{\Lambda^{M-1}_{m_{M-1}}\}$ on $\mathcal{Y}_0^n, \mathcal{Y}_1^n, \ldots, \mathcal{Y}_{M-1}^n$ respectively, such that the mean probability of a collective correct decision satisfies$^8$

$$\mathrm{Tr}\left( \hat\rho^{Y_0^n \ldots Y_{M-1}^n}_{m_0^{M-1}} \bigotimes_{k=0}^{M-1} \Lambda^k_{m_k \ldots m_{M-1}} \right) \ge 1 - \epsilon, \qquad (3.34)$$

for all $m_0^{M-1} \in W_0^{M-1}$. A rate $M$-tuple $(R_0, \ldots, R_{M-1})$ is achievable if there exists a sequence of $(2^{nR_0}, \ldots, 2^{nR_{M-1}}, n, \epsilon_n)$ codes with $\epsilon_n \to 0$. The classical capacity region of the broadcast channel is defined as the convex hull of the closure of all achievable rate $M$-tuples $(R_0, \ldots, R_{M-1})$. The classical capacity region of the two-user degraded quantum broadcast channel with discrete alphabet was derived by Yard et al. [52], and we used the infinite-dimensional extension of Yard et al.'s capacity theorem to prove the capacity region of the bosonic broadcast channel, subject to the minimum output entropy conjecture 2. The capacity region of the degraded quantum broadcast channel can easily be extended to the case of an arbitrary number $M$ of receivers. For notational similarity to the capacity region of the classical degraded broadcast channel, we state the capacity theorem first, using the shorthand notation for Holevo

7From here on, we use the shorthand notation $m_0^{M-1}$ to denote the message $M$-tuple $(m_0, \ldots, m_{M-1})$. Similarly, the notation $W_k^{M-1}$ will be used to denote the set $(W_k, \ldots, W_{M-1})$. We will also use the shorthand notation for probability distributions, such as $p_{W_1^{M-1}}(m_1^{M-1}) \equiv p_{W_1,\ldots,W_{M-1}}(m_1, \ldots, m_{M-1})$.

8$\mathcal{A}^n$ and $\mathcal{Y}_k^n$ are the $n$-channel-use alphabets of Alice and the $k$th receiver $Y_k$ respectively, with respective sizes $|\mathcal{A}^n|$ and $|\mathcal{Y}_k^n|$, for $k \in \{0, \ldots, M-1\}$.

information we introduced in footnote 6 earlier in this chapter.

Theorem 3.1. The capacity region of the $M$-receiver degraded broadcast channel $\mathcal{N}_{A-Y_0 \ldots Y_{M-1}}$, as defined in Eq. (3.31), is given by

$$R_0 \le \frac{1}{n} I(A^n; Y_0^n | T_1),$$
$$R_k \le \frac{1}{n} I(T_k; Y_k^n | T_{k+1}), \quad \forall k \in \{1, \ldots, M-2\},$$
$$R_{M-1} \le \frac{1}{n} I(T_{M-1}; Y_{M-1}^n), \qquad (3.35)$$

where $T_k$, $k \in \{1, \ldots, M-1\}$, form a set of auxiliary complex-valued random variables such that $T_{M-1} \to T_{M-2} \to \ldots \to T_k \to \ldots \to T_1 \to A^n$ is a Markov chain$^9$, i.e.,

$$p_{T_{M-1},\ldots,T_1,A^n}(i_{M-1}, \ldots, i_1, j) = p_{T_{M-1}}(i_{M-1}) \left( \prod_{k=2}^{M-1} p_{T_{k-1}|T_k}(i_{k-1}|i_k) \right) p_{A^n|T_1}(j|i_1). \qquad (3.36)$$

In order to find the optimum capacity region, the above rate region must be optimized over the joint distribution $p_{T_{M-1},\ldots,T_1,A^n}(i_{M-1}, \ldots, i_1, j)$. As Holevo information is not necessarily additive (unlike Shannon mutual information), the rate region must also be optimized over the codeword block-length $n$. The above Markov chain structure of the auxiliary random variables $T_k$, $k \in \{1, \ldots, M-1\}$, is shown to be optimal in the converse proof, which proves the optimality of the above capacity region without assuming any special structure of the auxiliary random variables. Also, note the striking similarity of the expressions for the capacity region given above with the capacity region of the classical $M$-receiver degraded broadcast channel, given in Eq. (3.8). Holevo information takes the place of Shannon mutual information in the quantum case, and because Holevo information can be superadditive, an additional regularization over the number of channel uses $n$ is required.

Proof. The proof of the achievability and converse for the above capacity region is a straightforward extension of Yard et al.'s proof of the two-receiver degraded broadcast channel capacity region. The proof, though simple, involves notational complexity. In order to preserve the flow of this chapter, we have omitted the formal proof of the $M$-receiver quantum degraded broadcast capacity region from this section, but for the sake of completeness and for the more interested readers, we have included the proof (achievability for $M = 3$ with a brief sketch of the general case, and converse for the general $M$-receiver case) in Appendix B.

9Here, we have used $A^n$ to denote a classical random variable with a slight abuse of notation. See footnote 5.


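M-receiver degraded broadcast capacity region in the Holevo information ($\chi(p_j, \hat\sigma_j)$) notation

The capacity region above can be re-cast in the Holevo-information notation that we used earlier in this chapter for the two-receiver quantum broadcast channel. For the channel model of the multiple-user quantum degraded broadcast channel we described in the section above (pictorially depicted in Fig. 3-5), our proposed capacity region (in Eq. (3.35)) can alternatively be expressed as$^{10}$

$$R_0 \le \frac{1}{n} \sum_{i_1} p_{T_1}(i_1)\, \chi\!\left( p_{A^n|T_1}(j|i_1), \hat\rho_j^{Y_0^n} \right) = \frac{1}{n} \sum_{i_1} p_{T_1}(i_1) \left[ S\Big( \sum_j p_{A^n|T_1}(j|i_1)\, \hat\rho_j^{Y_0^n} \Big) - \sum_j p_{A^n|T_1}(j|i_1)\, S\big( \hat\rho_j^{Y_0^n} \big) \right],$$

$$R_k \le \frac{1}{n} \sum_{i_{k+1}} p_{T_{k+1}}(i_{k+1})\, \chi\!\left( p_{T_k|T_{k+1}}(i_k|i_{k+1}), \hat\rho_{i_k}^{Y_k^n} \right), \quad \forall k \in \{1, \ldots, M-2\},$$
$$\phantom{R_k} = \frac{1}{n} \sum_{i_{k+1}} p_{T_{k+1}}(i_{k+1}) \left[ S\Big( \sum_{i_k} p_{T_k|T_{k+1}}(i_k|i_{k+1})\, \hat\rho_{i_k}^{Y_k^n} \Big) - \sum_{i_k} p_{T_k|T_{k+1}}(i_k|i_{k+1})\, S\big( \hat\rho_{i_k}^{Y_k^n} \big) \right],$$

$$R_{M-1} \le \frac{1}{n}\, \chi\!\left( p_{T_{M-1}}(i_{M-1}), \hat\rho_{i_{M-1}}^{Y_{M-1}^n} \right) = \frac{1}{n} \left[ S\Big( \sum_{i_{M-1}} p_{T_{M-1}}(i_{M-1})\, \hat\rho_{i_{M-1}}^{Y_{M-1}^n} \Big) - \sum_{i_{M-1}} p_{T_{M-1}}(i_{M-1})\, S\big( \hat\rho_{i_{M-1}}^{Y_{M-1}^n} \big) \right]. \qquad (3.38)$$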

Even  though  the  capacity-region  expressions  above  have  been  written  for  a  discrete 
alphabet,  in  Section  3.4.6,  we  will  generalize  it  to  a  continuous  alphabet  of  quantum 
states  over  an  infinite-dimensional  Hilbert  space,  in  which  case  the  summations  in 
Eqs.  (3.38)  will  be  replaced  by  integrals.  We  will  use  the  infinite-dimensional  extension 
of  this  capacity  theorem  in  the  following  section  to  evaluate  the  capacity  region  of 
the  M-receiver  bosonic  broadcast  channel. 


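10In Fig. 3-5, we define $j \equiv \{m_0, \ldots, m_{M-1}\}$ to be a collective index for the $M$ messages that Alice encodes into her $n$-use transmitted codeword state $\hat\rho_j^{A^n}$, and $\hat\rho_j^{Y_k^n}$ is defined to be the state received by $Y_k$ over $n$ successive channel uses. We introduce more notation here for conditional received states:

$$\hat\rho_{i_1}^{Y_k^n} \equiv \sum_j p_{A^n|T_1}(j|i_1)\, \hat\rho_j^{Y_k^n},$$
$$\hat\rho_{i_k}^{Y_k^n} \equiv \sum_{j, i_1, \ldots, i_{k-1}} p_{A^n|T_1}(j|i_1)\, p_{T_1|T_2}(i_1|i_2) \cdots p_{T_{k-1}|T_k}(i_{k-1}|i_k)\, \hat\rho_j^{Y_k^n}. \qquad (3.37)$$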
3.4 Bosonic Broadcast Channel


3.4.1 Channel model

The two-user noiseless bosonic broadcast channel $\mathcal{N}_{A-BC}$ consists of a collection of spatial and temporal bosonic modes at the transmitter (Alice), that interact with a minimal-quantum-noise environment and split into two sets of spatio-temporal modes en route to two independent receivers (Bob and Charlie). The multi-mode two-user bosonic broadcast channel $\mathcal{N}_{A-BC}$ is given by $\bigotimes_s \mathcal{N}_{A_s-B_sC_s}$, where $\mathcal{N}_{A_s-B_sC_s}$ is the broadcast-channel map for the $s$th mode, which can be obtained from the Heisenberg evolutions

$$\hat b_s = \sqrt{\eta_s}\, \hat a_s + \sqrt{1-\eta_s}\, \hat e_s, \quad \text{and} \qquad (3.39)$$
$$\hat c_s = \sqrt{1-\eta_s}\, \hat a_s - \sqrt{\eta_s}\, \hat e_s, \qquad (3.40)$$

where $\{\hat a_s\}$ are Alice's modal annihilation operators, and $\{\hat b_s\}$, $\{\hat c_s\}$ are the corresponding modal annihilation operators for Bob and Charlie, respectively. The modal transmissivities $\{\eta_s\}$ satisfy $0 \le \eta_s \le 1$, $\forall s$, and the environment modes $\{\hat e_s\}$ are in their vacuum states. We will limit our treatment here to the single-mode bosonic broadcast channel, as the capacity of the multi-mode channel can in principle be obtained by summing up capacities of all spatio-temporal modes and maximizing the sum capacity region subject to an overall input-power budget using Lagrange multipliers, cf. [55], where this was done for the capacity of the multi-mode single-user lossy bosonic channel.
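The mode transformation in Eqs. (3.39) and (3.40) is a real orthogonal mixing of the input mode and the environment mode, which is what preserves the bosonic commutation relations at the outputs. The sketch below, with illustrative function names, checks this for a single mode. It also computes the mean photon numbers reaching Bob and Charlie under the assumption stated in the text that the environment is in vacuum, so that only the $\hat a$ coefficient contributes.

```python
import math


def beam_splitter(eta):
    """2x2 matrix mapping input modes (a, e) to output modes (b, c), Eqs. (3.39)-(3.40)."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")
    t, r = math.sqrt(eta), math.sqrt(1.0 - eta)
    return [[t, r], [r, -t]]


def is_orthogonal(m, tol=1e-12):
    """Check M M^T = I, the condition for the outputs to obey bosonic commutators."""
    for i in range(2):
        for j in range(2):
            dot = sum(m[i][k] * m[j][k] for k in range(2))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


def output_mean_photons(eta, n_in):
    """Mean photon numbers at Bob and Charlie when the environment mode is in vacuum."""
    m = beam_splitter(eta)
    return m[0][0] ** 2 * n_in, m[1][0] ** 2 * n_in


print(is_orthogonal(beam_splitter(0.8)))
print(output_mean_photons(0.8, 5.0))
```

The input photons are shared in the ratio $\eta : (1-\eta)$, so for $\eta > 1/2$ Bob always receives the stronger signal.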

We are interested in finding the capacity region $(R_B, R_C)$ of achievable rate-pairs at which Alice can send information to Bob and Charlie, with vanishingly low probabilities of error. Alice is constrained by a mean photon-number (power) constraint $\langle \hat a^\dagger \hat a \rangle \le N$. The principal result we have for the single-mode bosonic broadcast channel stems from the fact that the bosonic broadcast channel is a degraded broadcast channel, and hence the capacity theorem we stated in the previous section can be adapted to this case by extending the result to infinite-dimensional Hilbert spaces.
Our capacity result depends on a minimum output entropy conjecture (dealt with in detail in Chapter 4). Assuming this conjecture to be true, we prove in this section that the ultimate capacity region of the single-mode noiseless bosonic broadcast channel (see Fig. 3-7) with a mean input photon-number constraint $\langle \hat a^\dagger \hat a \rangle \le N$ is given by

$$R_B \le g(\eta \beta N), \quad \text{and} \qquad (3.41)$$
$$R_C \le g((1-\eta) N) - g((1-\eta) \beta N), \qquad (3.42)$$

for $0 \le \beta \le 1$, where $g(x) \equiv (1+x)\ln(1+x) - x\ln(x)$. We further prove, assuming the validity of the minimum output entropy conjecture, that this rate region is additive and is achievable with single-channel-use coherent-state encoding with the following Gaussian prior and conditional distributions:

$$p_T(t) = \frac{1}{\pi N} \exp\left( -\frac{|t|^2}{N} \right), \quad \text{and} \qquad (3.43)$$
$$p_{A|T}(\alpha|t) = \frac{1}{\pi N \beta} \exp\left( -\frac{|\sqrt{1-\beta}\, t - \alpha|^2}{N \beta} \right), \qquad (3.44)$$

where $T$ is a complex-valued auxiliary classical random variable taking values $t \in \mathbb{C}$, and $A$ is a complex-valued classical random variable taking value $\alpha \in \mathbb{C}$ when Alice sends out the single-mode coherent state $|\alpha\rangle$.
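The boundary of the region in Eqs. (3.41) and (3.42) can be evaluated directly from $g(x)$. The following is a minimal sketch with hypothetical function names; it only evaluates the stated expressions and says nothing about the conjecture on which the region depends. The value $g(0) = 0$ is taken by continuity.

```python
import math


def g(x):
    """g(x) = (1 + x) ln(1 + x) - x ln(x), in nats, with g(0) = 0 by continuity."""
    if x < 0.0:
        raise ValueError("mean photon number must be non-negative")
    if x == 0.0:
        return 0.0
    return (1.0 + x) * math.log(1.0 + x) - x * math.log(x)


def bosonic_bc_boundary(eta, n_mean, beta):
    """Boundary point (R_B, R_C) of Eqs. (3.41)-(3.42), in nats per channel use."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    if not 0.5 < eta <= 1.0:
        raise ValueError("the degraded ordering in the text assumes eta > 1/2")
    r_b = g(eta * beta * n_mean)
    r_c = g((1.0 - eta) * n_mean) - g((1.0 - eta) * beta * n_mean)
    return r_b, r_c


# Sweep the power-sharing parameter for eta = 0.8 and N = 5 photons.
for beta in (0.0, 0.25, 0.5, 0.75, 1.0):
    r_b, r_c = bosonic_bc_boundary(0.8, 5.0, beta)
    print(f"beta={beta:.2f}  R_B={r_b:.4f}  R_C={r_c:.4f}")
```

At $\beta = 1$ all of the rate goes to Bob, giving $R_B = g(\eta N)$, and at $\beta = 0$ all of it goes to Charlie, giving $R_C = g((1-\eta)N)$; these are the single-user rates of the pure-loss channel to each receiver.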

3.4.2 Degraded broadcast condition

Lemma 3.2. The pure-loss bosonic broadcast channel $\mathcal{N}_{A-BC}$, with transmissivity $\eta > 1/2$, is stochastically equivalent to a degraded cq-broadcast channel $A \to B \to C$, in which the degrading channel from Bob to Charlie, $\mathcal{N}^{\rm deg}_{B-C}$, is another beam splitter with transmissivity $\eta' = (1-\eta)/\eta$ (Fig. 3-8).

Proof. Refer to Figure 3-8. The annihilation operator $\hat g$ corresponds to the output of the degrading channel, which is excited in a state $\hat\rho^g$. In order to prove that the bosonic broadcast channel $\mathcal{N}_{A-BC}$ is indeed equivalent to a degraded broadcast channel, we need to show that the states $\hat\rho^g$ and $\hat\rho^c$ are identical quantum states,
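[Figure 3-7 diagram: Alice's mode $\hat{a}$ and an environment mode $\hat{e}$ in the vacuum state ($\hat{\rho}_e = |0\rangle\langle 0|$) enter a beam splitter of transmissivity $\eta$. The outputs are $\hat{b} = \sqrt{\eta}\,\hat{a} + \sqrt{1-\eta}\,\hat{e}$ for Bob ($B$, state $\hat{\rho}_b$) and $\hat{c} = \sqrt{1-\eta}\,\hat{a} - \sqrt{\eta}\,\hat{e}$ for Charlie ($C$, state $\hat{\rho}_c$).]

Figure 3-7: A single-mode noiseless bosonic broadcast channel with two receivers, $\mathcal{N}_{A-BC}$, can be envisioned as a beam splitter with transmissivity $\eta$. With $\eta > 1/2$, the bosonic broadcast channel reduces to a degraded quantum broadcast channel, where Bob ($B$) is the less-noisy receiver and Charlie ($C$) is the more noisy (degraded) receiver.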


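[Figure 3-8 diagram: the beam splitter of Fig. 3-7, with vacuum environment $\hat{e}$ ($\hat{\rho}_e = |0\rangle\langle 0|$) and Bob's output $\hat{b} = \sqrt{\eta}\,\hat{a} + \sqrt{1-\eta}\,\hat{e}$, is followed by a second beam splitter of transmissivity $\eta' = (1-\eta)/\eta$ that mixes $\hat{b}$ with a vacuum mode $\hat{f}$ ($\hat{\rho}_f = |0\rangle\langle 0|$). Its outputs are $\hat{g} = \sqrt{\eta'}\,\hat{b} + \sqrt{1-\eta'}\,\hat{f}$ (Charlie, state $\hat{\rho}_g$) and $\hat{h} = \sqrt{1-\eta'}\,\hat{b} - \sqrt{\eta'}\,\hat{f}$.]

Figure 3-8: The stochastically degraded version of the single-mode bosonic broadcast channel.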
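i.e., the classical statistics of the results of measuring the states $\hat{\rho}_g$ and $\hat{\rho}_c$ using any POVM will be exactly the same, provided $\eta > 1/2$.

Let us compute the antinormally ordered characteristic functions of the states $\hat{\rho}_c$ and $\hat{\rho}_g$. We have

$$\begin{aligned}
\chi_A^{\rho_c}(\zeta) &= \langle e^{-\zeta^* \hat{c}}\, e^{\zeta \hat{c}^\dagger} \rangle \\
&= \langle e^{-\zeta^* \sqrt{1-\eta}\,\hat{a}}\, e^{\zeta \sqrt{1-\eta}\,\hat{a}^\dagger} \rangle \langle e^{\zeta^* \sqrt{\eta}\,\hat{e}}\, e^{-\zeta \sqrt{\eta}\,\hat{e}^\dagger} \rangle \\
&= \chi_A^{\rho_a}(\sqrt{1-\eta}\,\zeta)\, \chi_A^{\rho_e}(-\sqrt{\eta}\,\zeta) \\
&= \chi_A^{\rho_a}(\sqrt{1-\eta}\,\zeta)\, e^{-\eta|\zeta|^2},
\end{aligned} \tag{3.45}$$

and

$$\begin{aligned}
\chi_A^{\rho_g}(\zeta) &= \chi_A^{\rho_b}(\sqrt{\eta'}\,\zeta)\, \chi_A^{\rho_f}(\sqrt{1-\eta'}\,\zeta) \\
&= \chi_A^{\rho_a}(\sqrt{\eta\eta'}\,\zeta)\, \chi_A^{\rho_e}(\sqrt{\eta'(1-\eta)}\,\zeta)\, \chi_A^{\rho_f}(\sqrt{1-\eta'}\,\zeta) \\
&= \chi_A^{\rho_a}(\sqrt{\eta\eta'}\,\zeta)\, e^{-\eta'(1-\eta)|\zeta|^2}\, e^{-(1-\eta')|\zeta|^2} \\
&= \chi_A^{\rho_a}(\sqrt{1-\eta}\,\zeta)\, e^{-\eta|\zeta|^2},
\end{aligned} \tag{3.46}$$

so that $\chi_A^{\rho_c}(\zeta) = \chi_A^{\rho_g}(\zeta)$, $\forall \hat{\rho}_a$. The last line of (3.46) uses $\eta\eta' = 1-\eta$ and $\eta'(1-\eta) + (1-\eta') = \eta$. Inverse Fourier transforming these characteristic functions thus yields the same expressions for $\hat{\rho}_c$ and $\hat{\rho}_g$. Hence $\hat{\rho}_g$ and $\hat{\rho}_c$ are identical states, and the pure-loss bosonic broadcast channel $\mathcal{N}_{A-BC}$ is a degraded broadcast channel for $\eta > 1/2$.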
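The parameter bookkeeping in Eqs. (3.45) and (3.46) can be checked numerically. The following is a minimal sketch, not part of the original proof: the function names are hypothetical, and each function simply returns the signal-amplitude gain and the coefficient of $-|\zeta|^2$ in the vacuum exponent for the direct and the cascaded route to Charlie.

```python
import math


def degrading_transmissivity(eta):
    """eta' = (1 - eta) / eta of Lemma 3.2 (needs eta >= 1/2 so that eta' <= 1)."""
    if not 0.5 <= eta <= 1.0:
        raise ValueError("eta must lie in [1/2, 1]")
    return (1.0 - eta) / eta


def direct_charlie(eta):
    """(amplitude gain, vacuum exponent) read off the last line of Eq. (3.45)."""
    return math.sqrt(1.0 - eta), eta


def cascaded_charlie(eta):
    """Same pair for the cascade A -> B -> g of Fig. 3-8, as in Eq. (3.46)."""
    eta_p = degrading_transmissivity(eta)
    gain = math.sqrt(eta * eta_p)
    # vacuum contributions of modes e and f add up in the exponent
    exponent = eta_p * (1.0 - eta) + (1.0 - eta_p)
    return gain, exponent


if __name__ == "__main__":
    for eta in (0.6, 0.8, 0.95):
        print(eta, direct_charlie(eta), cascaded_charlie(eta))
```

For any $\eta \in [1/2, 1]$ the two pairs should agree to floating-point precision, which is the content of the lemma at the level of these two parameters.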

3.4.3  Noiseless  bosonic  broadcast  channel  with  two  receivers 

It is known [10, 7, 39] that coherent-state modulation using an isotropic Gaussian prior distribution achieves the ultimate classical capacity (maximizes the Holevo information) for a single-mode pure-loss bosonic channel. It is also known, however, that for quantum multiple-access channels, coherent-state encodings are not optimal [11]. So it is not clear, at the outset, whether coherent-state encoding will be capacity achieving for the bosonic broadcast channel. Nevertheless, it is worth assessing the capacity region realized by coherent-state encoding.

Consider the two-user bosonic broadcast channel $\mathcal{N}_{A-BC}$ and assume that Alice has access to all coherent states $|\alpha\rangle$ to encode her information, with a mean photon-number constraint $\langle \hat{a}^\dagger \hat{a} \rangle \le \bar{N}$. Bob and Charlie thus receive attenuated versions of the coherent states that Alice transmits at each channel use. Let us introduce an auxiliary classical complex-valued random variable $T$, and an associated coherent-state alphabet $|\tau\rangle$ and prior probability distribution $p_T(\tau)$. Alice transmits coherent states $|\alpha\rangle$ with conditional probability $p_{A|T}(\alpha|\tau)$. The first step towards proving that the ultimate capacity region of the two-user bosonic broadcast channel is given by Eqs. (3.41) and (3.42) is to show that the probability distributions $p_T(\tau)$ and $p_{A|T}(\alpha|\tau)$, as given by Eqs. (3.43) and (3.44), achieve these rates.

Yard et al.'s capacity region in Equations (3.27) and (3.28) requires finite-dimensional Hilbert spaces. Nevertheless, we will use their result for the bosonic broadcast channel, which has an infinite-dimensional state space, as their result can be extended to infinite-dimensional state spaces by means of a limiting argument.¹¹

Theorem 3.3 — Assuming the truth of strong conjecture 2 (see Section 4.1), the ultimate capacity region of the single-mode noiseless bosonic broadcast channel (see Fig. 3-7) with a mean input photon-number constraint $\langle \hat{a}^\dagger \hat{a} \rangle \le \bar{N}$ is given by

$$R_B \le g(\eta\beta\bar{N}), \quad \text{and} \tag{3.47}$$

$$R_C \le g((1-\eta)\bar{N}) - g((1-\eta)\beta\bar{N}), \tag{3.48}$$

for $0 \le \beta \le 1$, where $g(x) = (1+x)\ln(1+x) - x\ln(x)$. This rate region is additive and is achievable with single-channel-use coherent-state encoding with the Gaussian prior and conditional distributions given in Eqs. (3.43) and (3.44).

¹¹ When $|T|$ and $|A|$ are finite, and we are using coherent states, we end up with a finite number of possible transmitted states, which leads to a finite number of possible states received by Bob and Charlie. To be more explicit, let us limit the auxiliary-input alphabet ($T$), and hence the input ($A$) and the output alphabets ($B$ and $C$), to coherent states in the finite-dimensional subspace spanned by the Fock states $\{|0\rangle, |1\rangle, \ldots, |K\rangle\}$, where $K \gg \bar{N}$. Applying Yard et al.'s theorem to the Hilbert space spanned by these states then gives us a broadcast channel capacity region that must be strictly an inner bound of the rate region given by Eqs. (3.49) and (3.50). In the limit that we choose $K$ sufficiently large (maintaining the cardinality condition $|T| \le |A|$ that is required by the theorem), the rate-region expressions given by Yard et al.'s theorem can be brought as close as we wish to those given by Eqs. (3.49) and (3.50).
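As an illustration of the rate region in the theorem, the boundary can be traced by sweeping $\beta$ from 0 to 1. The sketch below is only an evaluation of the two bounds as stated; the function names (`g`, `noiseless_rate_pair`, `rate_region_boundary`) are hypothetical, and rates are in nats because $g$ uses the natural logarithm.

```python
import math


def g(x):
    """g(x) = (1 + x) ln(1 + x) - x ln(x), with g(0) = 0 taken as the limit."""
    if x < 0:
        raise ValueError("mean photon number must be non-negative")
    if x == 0:
        return 0.0
    return (1.0 + x) * math.log(1.0 + x) - x * math.log(x)


def noiseless_rate_pair(eta, n_bar, beta):
    """Boundary point (R_B, R_C) of the noiseless region for a given beta."""
    r_b = g(eta * beta * n_bar)
    r_c = g((1.0 - eta) * n_bar) - g((1.0 - eta) * beta * n_bar)
    return r_b, r_c


def rate_region_boundary(eta, n_bar, points=11):
    """Sample the boundary on a uniform grid of beta in [0, 1]."""
    return [noiseless_rate_pair(eta, n_bar, i / (points - 1)) for i in range(points)]


if __name__ == "__main__":
    for r_b, r_c in rate_region_boundary(eta=0.8, n_bar=5.0):
        print(f"R_B = {r_b:.4f} nats, R_C = {r_c:.4f} nats")
```

At $\beta = 1$ all of the rate goes to Bob ($R_C = 0$), and at $\beta = 0$ all of it goes to Charlie ($R_B = 0$); dividing by $\ln 2$ converts the values to bits per channel use.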


Proof [Achievability] — Using the infinite-dimensional (continuous-variable) extension of Eqs. (3.27) and (3.28), the $n = 1$ rate-region for the bosonic broadcast channel using coherent-state encoding is given by:

$$R_B \le \int p_T(\tau)\, S\!\left( \int p_{A|T}(\alpha|\tau)\, |\sqrt{\eta}\,\alpha\rangle\langle\sqrt{\eta}\,\alpha|\, d^2\alpha \right) d^2\tau, \tag{3.49}$$

$$\begin{aligned}
R_C \le\; & S\!\left( \int\!\!\int p_T(\tau)\, p_{A|T}(\alpha|\tau)\, |\sqrt{1-\eta}\,\alpha\rangle\langle\sqrt{1-\eta}\,\alpha|\, d^2\alpha\, d^2\tau \right) \\
& - \int p_T(\tau)\, S\!\left( \int p_{A|T}(\alpha|\tau)\, |\sqrt{1-\eta}\,\alpha\rangle\langle\sqrt{1-\eta}\,\alpha|\, d^2\alpha \right) d^2\tau,
\end{aligned} \tag{3.50}$$

where we need to maximize the bounds for $R_B$ and $R_C$ over all joint distributions $p_T(\tau)\, p_{A|T}(\alpha|\tau)$ subject to $\langle |\alpha|^2 \rangle \le \bar{N}$. Note that $A$ and $T$ are complex-valued random variables, and the second term in the $R_B$ bound (3.27) vanishes, because the von Neumann entropy of a pure state is zero. Substituting Eqs. (3.43) and (3.44) into Eqs. (3.49) and (3.50) shows that the rate region of Eqs. (3.41) and (3.42) is achievable using single-use coherent-state encoding.
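The substitution step relies on two moments of the Gaussian distributions (3.43) and (3.44): the marginal input power is $\langle |\alpha|^2 \rangle = \bar{N}$, and the spread of $\alpha$ around $\sqrt{1-\beta}\,\tau$ is $\beta\bar{N}$, which is what produces the arguments $\eta\beta\bar{N}$ and $(1-\eta)\beta\bar{N}$ of $g(\cdot)$. The following Monte Carlo sketch checks these two moments; the function names, sample count, and seed are arbitrary illustrative choices.

```python
import math
import random


def complex_gaussian(rng, variance):
    """Zero-mean circular complex Gaussian sample with E|z|^2 = variance."""
    sigma = math.sqrt(variance / 2.0)  # standard deviation per quadrature
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def sample_tau_alpha(rng, n_bar, beta):
    """Draw tau from Eq. (3.43), then alpha given tau from Eq. (3.44)."""
    tau = complex_gaussian(rng, n_bar)
    alpha = math.sqrt(1.0 - beta) * tau + complex_gaussian(rng, beta * n_bar)
    return tau, alpha


def empirical_powers(n_bar, beta, samples=50000, seed=1):
    """Estimate E|alpha|^2 and E|alpha - sqrt(1 - beta) tau|^2."""
    rng = random.Random(seed)
    total = 0.0
    conditional = 0.0
    for _ in range(samples):
        tau, alpha = sample_tau_alpha(rng, n_bar, beta)
        total += abs(alpha) ** 2
        conditional += abs(alpha - math.sqrt(1.0 - beta) * tau) ** 2
    return total / samples, conditional / samples


if __name__ == "__main__":
    print(empirical_powers(n_bar=5.0, beta=0.3))
```

With $\bar{N} = 5$ and $\beta = 0.3$ the two estimates should be close to 5 and 1.5, up to sampling error.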

Proof [Converse] — Assume that the rate pair $(R_B, R_C)$ is achievable. Let $\{x^n(m,k)\}$, and POVMs $\{\Lambda_{mk}\}$ and $\{\Lambda'_k\}$ comprise any $(2^{nR_B}, 2^{nR_C}, n, \epsilon)$ code in the achieving sequence. Suppose that Bob and Charlie store their decoded messages in the classical registers $\hat{W}_B$ and $\hat{W}_C$, respectively. Let us use $p_{W_B,W_C}(m,k) = p_{W_B}(m)\, p_{W_C}(k)$ to denote the joint probability mass function of the independent message registers $W_B$ and $W_C$. As $(R_B, R_C)$ is an achievable rate-pair, there must exist $\epsilon'_n \to 0$, such that

$$\begin{aligned}
nR_C &= H(W_C) \\
&\le I(W_C; \hat{W}_C) + n\epsilon'_n \\
&\le \chi(p_{W_C}(k), \hat{\rho}_k^{C^n}) + n\epsilon'_n,
\end{aligned} \tag{3.51}$$

where $I(W_C; \hat{W}_C) = H(W_C) - H(W_C|\hat{W}_C)$ is the Shannon mutual information, and $\hat{\rho}_k^{C^n} = \sum_m p_{W_B}(m)\, \hat{\rho}_{m,k}^{C^n}$. The second line follows from Fano's inequality and the third line follows from Holevo's bound¹². Similarly, for an $\epsilon''_n \to 0$, we can bound $nR_B$ as

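$$\begin{aligned}
nR_B &= H(W_B) \\
&\le I(W_B; \hat{W}_B) + n\epsilon''_n \\
&\le \chi(p_{W_B}(m), \hat{\rho}_m^{B^n}) + n\epsilon''_n \\
&\le \sum_k p_{W_C}(k)\, \chi(p_{W_B}(m), \hat{\rho}_{m,k}^{B^n}) + n\epsilon''_n,
\end{aligned} \tag{3.52}$$

where the three inequalities above follow from Fano's inequality, Holevo's bound, and the convexity of the Holevo information in the signal states, respectively. In order to prove the converse, we now need to show that there exists a number $\beta \in [0,1]$, such that

$$\sum_k p_{W_C}(k)\, \chi(p_{W_B}(m), \hat{\rho}_{m,k}^{B^n}) \le n g(\eta\beta\bar{N}),$$

$$\text{and} \quad \chi(p_{W_C}(k), \hat{\rho}_k^{C^n}) \le n g((1-\eta)\bar{N}) - n g((1-\eta)\beta\bar{N}).$$

From the non-negativity of the von Neumann entropy $S(\hat{\rho}_{m,k}^{B^n})$, it follows that

$$\sum_k p_{W_C}(k)\, \chi(p_{W_B}(m), \hat{\rho}_{m,k}^{B^n}) \le \sum_k p_{W_C}(k)\, S\!\left( \sum_m p_{W_B}(m)\, \hat{\rho}_{m,k}^{B^n} \right),$$

as the second term of the Holevo information above is non-negative. Because the maximum von Neumann entropy of a single-mode bosonic state with mean photon number at most $N$ is given by $g(N)$, we have that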

$$0 \le S(\hat{\rho}_k^{B^n}) \le \sum_{j=1}^{n} g(\eta \bar{N}_{k_j}) \le n g(\eta \bar{N}_k), \tag{3.53}$$

where $\bar{N}_k \equiv \frac{1}{n} \sum_{j=1}^{n} \bar{N}_{k_j}$, and $\bar{N}_{k_j}$ is the mean photon number of the $j$th symbol $\hat{\rho}_k^{A_j}$ of the $n$-symbol codeword $\hat{\rho}_k^{A^n}$, for $j \in \{1, \ldots, n\}$. The last inequality above follows because $g(x)$ is concave. Therefore, $\exists\, \beta_k \in [0,1]$, $\forall k \in W_C$, such that

$$S(\hat{\rho}_k^{B^n}) = n g(\eta \beta_k \bar{N}_k), \tag{3.54}$$

¹² Holevo's bound [27, 28, 29]: Let $X$ be the input alphabet for a channel, $\{p_i, \hat{\rho}_i\}$ the priors and modulating states, $\{\Pi_j\}$ a POVM, and $Y$ the resulting output (classical) alphabet. The Shannon mutual information $I(X;Y)$ is upper bounded by the Holevo information $\chi(p_i, \hat{\rho}_i)$.

because $g(x)$ is a monotonically increasing function of $x \ge 0$. Because of the degraded nature of the channel, Charlie's state can be obtained as the output of a beam splitter whose input states are Bob's state (coupling coefficient $\eta' = (1-\eta)/\eta$ to Charlie) and a vacuum state (coupling coefficient $1-\eta'$ to Charlie). It follows, from assuming the truth of strong conjecture 2 (see chapter 4), that

$$S(\hat{\rho}_k^{C^n}) \ge n g((1-\eta)\beta_k \bar{N}_k). \tag{3.55}$$

$\bar{N}$ is the average number of photons per use at the transmitter (Alice), averaged over the entire codebook. Thus, the mean photon number of the $n$-use average codeword at Bob, $\hat{\rho}^{B^n} \equiv \sum_k p_{W_C}(k)\, \hat{\rho}_k^{B^n}$, is $\eta\bar{N}$. Hence,

$$0 \le \sum_k p_{W_C}(k)\, S(\hat{\rho}_k^{B^n}) \le S(\hat{\rho}^{B^n}) \le n g(\eta\bar{N}), \tag{3.56}$$

where the second inequality follows from the concavity of von Neumann entropy, and the third inequality arises from maximizing the entropy subject to the average photon number constraint. The monotonicity of $g(x)$ then implies that there is a $\beta \in [0,1]$, such that $\sum_k p_{W_C}(k)\, S(\hat{\rho}_k^{B^n}) = n g(\eta\beta\bar{N})$. Hence we have

$$\sum_k p_{W_C}(k)\, \chi(p_{W_B}(m), \hat{\rho}_{m,k}^{B^n}) \le n g(\eta\beta\bar{N}), \tag{3.57}$$

for some $\beta \in [0,1]$. Equation (3.54) and the uniform distribution $p_{W_C}(k) = 1/2^{nR_C}$ imply that

$$\sum_k \frac{1}{2^{nR_C}}\, g(\eta \beta_k \bar{N}_k) = g(\eta\beta\bar{N}). \tag{3.58}$$

k 

Using (3.58), the concavity of $g(x)$, and $\eta > 1/2$, we have shown (proof in Appendix C) that

$$\sum_k \frac{1}{2^{nR_C}}\, g((1-\eta)\beta_k \bar{N}_k) \ge g((1-\eta)\beta\bar{N}). \tag{3.59}$$

From Eq. (3.59), and Eq. (3.55) summed over $k$, we then obtain

$$\sum_k p_{W_C}(k)\, S(\hat{\rho}_k^{C^n}) \ge n g((1-\eta)\beta\bar{N}). \tag{3.60}$$

Finally, writing Charlie's Holevo information as

$$\begin{aligned}
\chi(p_{W_C}(k), \hat{\rho}_k^{C^n}) &= S\!\left( \sum_k p_{W_C}(k)\, \hat{\rho}_k^{C^n} \right) - \sum_k p_{W_C}(k)\, S(\hat{\rho}_k^{C^n}) \\
&\le n g((1-\eta)\bar{N}) - \sum_k p_{W_C}(k)\, S(\hat{\rho}_k^{C^n}),
\end{aligned} \tag{3.61}$$

we can use Eq. (3.60) to get

$$\chi(p_{W_C}(k), \hat{\rho}_k^{C^n}) \le n g((1-\eta)\bar{N}) - n g((1-\eta)\beta\bar{N}), \tag{3.62}$$

which completes the proof. The capacity region is additive, because the achievability part of the proof above shows that a product distribution over a single-use coherent-state alphabet achieves the rate region.
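The step from (3.58) to (3.59) can be probed numerically. The sketch below is a sanity check on sample values, not a substitute for the Appendix C proof: given equiprobable values $x_k$ standing in for $\eta\beta_k\bar{N}_k$, it solves $g(x) = \frac{1}{K}\sum_k g(x_k)$ by bisection to obtain the stand-in for $\eta\beta\bar{N}$, rescales by $(1-\eta)/\eta$, and returns the left side of (3.59) minus the right side. The function names and the bisection inverse are illustrative assumptions.

```python
import math


def g(x):
    """g(x) = (1 + x) ln(1 + x) - x ln(x), with g(0) = 0."""
    if x < 0:
        raise ValueError("argument must be non-negative")
    if x == 0:
        return 0.0
    return (1.0 + x) * math.log(1.0 + x) - x * math.log(x)


def g_inverse(y):
    """Solve g(x) = y for x >= 0 by bisection (g is monotonically increasing)."""
    lo, hi = 0.0, 1.0
    while g(hi) < y:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def converse_gap(eta, bob_photons):
    """LHS minus RHS of Eq. (3.59) for equiprobable x_k = eta * beta_k * N_k."""
    if not 0.5 <= eta <= 1.0:
        raise ValueError("eta must lie in [1/2, 1]")
    kappa = (1.0 - eta) / eta  # rescales Bob's photon numbers to Charlie's
    count = len(bob_photons)
    mean_entropy = sum(g(x) for x in bob_photons) / count
    x_equivalent = g_inverse(mean_entropy)  # plays the role of eta * beta * N
    lhs = sum(g(kappa * x) for x in bob_photons) / count
    return lhs - g(kappa * x_equivalent)


if __name__ == "__main__":
    print(converse_gap(0.8, [0.0, 2.0]))  # expected to be non-negative
```

A non-negative return value is consistent with (3.59); the gap should vanish when all $x_k$ are equal.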

3.4.4  Achievable  rate  region  using  coherent  detection  receivers 

Unless  we  have  a  proof  of  strong  conjecture  2,  we  cannot  assert  that  Eqs.  (3.41) 
and  (3.42)  define  the  capacity  region  of  the  two-user  bosonic  broadcast  channel.  How¬ 
ever,  because  the  rate  region  specified  by  these  equations  is  achievable  with  single-use 
coherent-state  encoding,  we  know  that  they  comprise  an  inner  bound  on  the  ultimate 
capacity  region.  In  this  regard,  it  is  instructive  to  examine  how  the  rate  region  de¬ 
fined  by  Eqs.  (3.41)  and  (3.42)  compares  with  what  can  be  realized  by  conventional, 
coherent  detection  schemes  used  in  optical  communications. 

Suppose Alice sends a coherent state $|\alpha\rangle$ into the channel in Fig. 3-7. Bob and Charlie will then receive coherent states $|\sqrt{\eta}\,\alpha\rangle$ and $|\sqrt{1-\eta}\,\alpha\rangle$, respectively. Moreover, if Bob and Charlie employ homodyne-detection receivers, with local oscillator phases set to observe the real quadrature, their results of measurement will be $\sqrt{\eta}\,\Re(\alpha) + \nu_B$ for Bob and $\sqrt{1-\eta}\,\Re(\alpha) + \nu_C$ for Charlie, where $\nu_B$ and $\nu_C$ are independent, identically distributed, real-valued Gaussian random variables with variance 1/4 [18]. Similarly, if Bob and Charlie employ heterodyne-detection receivers, their results of measurement will be $\sqrt{\eta}\,\alpha + z_B$ and $\sqrt{1-\eta}\,\alpha + z_C$, where $z_B$ and $z_C$ are independent, identically distributed, complex-valued zero-mean Gaussian random variables with variance 1/2 [18]. These results imply that the $\eta > 1/2$ bosonic broadcast channel with coherent-state encoding and homodyne detection is a classical degraded scalar-Gaussian broadcast channel, whose capacity region is known to be [3]

$$R_B \le \frac{1}{2} \ln\left(1 + 4\eta\beta\bar{N}\right), \tag{3.63}$$

$$R_C \le \frac{1}{2} \ln\left(1 + \frac{4(1-\eta)(1-\beta)\bar{N}}{1 + 4(1-\eta)\beta\bar{N}}\right), \tag{3.64}$$

for $0 \le \beta \le 1$. Similarly, the $\eta > 1/2$ bosonic broadcast channel with coherent-state encoding and heterodyne detection is a classical degraded vector-Gaussian broadcast channel, whose capacity region is known to be

$$R_B \le \ln\left(1 + \eta\beta\bar{N}\right), \tag{3.65}$$

$$R_C \le \ln\left(1 + \frac{(1-\eta)(1-\beta)\bar{N}}{1 + (1-\eta)\beta\bar{N}}\right), \tag{3.66}$$

for $0 \le \beta \le 1$. In Fig. 3-9 we compare the capacity regions attained by a coherent-state input alphabet using homodyne, heterodyne, and optimum reception. As is known for single-user bosonic communication, homodyne detection performs better than heterodyne detection when the transmitters are starved for photons, because it has lower noise. Conversely, heterodyne detection outperforms homodyne detection when the transmitters are photon rich, because it has a factor-of-two bandwidth advantage over homodyne detection. In order to bridge the gap between the coherent-detection capacity regions and the ultimate capacity region, one must use joint detection over long codewords. Future investigation will be needed to develop receivers that can approach the ultimate communication rates over the bosonic broadcast channel.

Figure 3-9: Comparison of bosonic broadcast channel capacity regions, in bits per channel use, achieved by coherent-state encoding using homodyne detection (the capacity region lies inside the boundary marked by circles), heterodyne detection (the capacity region lies inside the boundary marked by dashes), and optimum reception (the capacity region lies inside the boundary marked by the solid curve), for $\eta = 0.8$, and $\bar{N} = 1$, 5, and 15. [Plot not reproduced; the surviving axis label is $R_B$.]
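The comparison described above can be reproduced numerically from the three pairs of rate bounds. The sketch below evaluates the homodyne, heterodyne, and optimum-reception boundary points for a given $\beta$; the function names are hypothetical and the values are in nats (divide by $\ln 2$ for the bits used in Fig. 3-9).

```python
import math


def g(x):
    """g(x) = (1 + x) ln(1 + x) - x ln(x), with g(0) = 0."""
    if x == 0:
        return 0.0
    return (1.0 + x) * math.log(1.0 + x) - x * math.log(x)


def optimum_pair(eta, n_bar, beta):
    """Eqs. (3.41)-(3.42): coherent-state encoding with optimum reception."""
    return (g(eta * beta * n_bar),
            g((1.0 - eta) * n_bar) - g((1.0 - eta) * beta * n_bar))


def homodyne_pair(eta, n_bar, beta):
    """Eqs. (3.63)-(3.64): homodyne detection at both receivers."""
    r_b = 0.5 * math.log(1.0 + 4.0 * eta * beta * n_bar)
    snr_c = 4.0 * (1.0 - eta) * (1.0 - beta) * n_bar / (1.0 + 4.0 * (1.0 - eta) * beta * n_bar)
    return r_b, 0.5 * math.log(1.0 + snr_c)


def heterodyne_pair(eta, n_bar, beta):
    """Eqs. (3.65)-(3.66): heterodyne detection at both receivers."""
    r_b = math.log(1.0 + eta * beta * n_bar)
    snr_c = (1.0 - eta) * (1.0 - beta) * n_bar / (1.0 + (1.0 - eta) * beta * n_bar)
    return r_b, math.log(1.0 + snr_c)


if __name__ == "__main__":
    for n_bar in (1.0, 5.0, 15.0):  # the values used in Fig. 3-9, eta = 0.8
        for name, pair in (("homodyne", homodyne_pair),
                           ("heterodyne", heterodyne_pair),
                           ("optimum", optimum_pair)):
            r_b, r_c = pair(0.8, n_bar, 0.5)
            print(f"N = {n_bar:4.1f} {name:10s} R_B = {r_b / math.log(2):.3f} bits, "
                  f"R_C = {r_c / math.log(2):.3f} bits")
```

Setting $\beta = 1$ isolates Bob's single-user rate: homodyne is ahead of heterodyne at small $\eta\bar{N}$ and behind it at large $\eta\bar{N}$, while the optimum-reception bound exceeds both.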

3.4.5 Thermal-noise bosonic broadcast channel with two receivers


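Now assume that the environment mode $\hat{e}$ in the bosonic broadcast channel (Fig. 3-7) is in a zero-mean thermal state with mean photon number $N$ (see Fig. 3-10), i.e.,

$$\hat{\rho}_e = \hat{\rho}_{T,N} \equiv \int \frac{e^{-|\mu|^2/N}}{\pi N}\, |\mu\rangle\langle\mu|\, d^2\mu. \tag{3.67}$$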


Theorem 3.4 — Provided the minimum output entropy conjectures strong conjecture 1 and strong conjecture 3 (see Section 4.1) are true, the capacity region for the bosonic broadcast channel with additive thermal noise, with mean photon number constraint $\bar{N}$ at the input and an additive zero-mean thermal noise with $N$ photons per mode, on average, is given by

$$R_B \le g(\eta\beta\bar{N} + (1-\eta)N) - g((1-\eta)N), \tag{3.68}$$

$$R_C \le g((1-\eta)\bar{N} + \eta N) - g((1-\eta)\beta\bar{N} + \eta N), \tag{3.69}$$
[Figure 3-10 diagram: Alice's mode $\hat{a}$ and an environment mode $\hat{e}$ in the thermal state $\hat{\rho}_e = \hat{\rho}_{T,N} = \int \frac{e^{-|\mu|^2/N}}{\pi N}\, |\mu\rangle\langle\mu|\, d^2\mu$ enter a beam splitter of transmissivity $\eta$. The outputs are $\hat{b} = \sqrt{\eta}\,\hat{a} + \sqrt{1-\eta}\,\hat{e}$ for Bob ($B$, state $\hat{\rho}_b$) and $\hat{c} = \sqrt{1-\eta}\,\hat{a} - \sqrt{\eta}\,\hat{e}$ for Charlie ($C$, state $\hat{\rho}_c$).]

Figure 3-10: A single-mode bosonic broadcast channel with two receivers, $\mathcal{N}_{A-BC}$, with additive thermal noise. The transmitter Alice ($A$) is constrained to use $\bar{N}$ photons per use of the channel, and the noise (environment) mode is in a zero-mean thermal state $\hat{\rho}_{T,N}$, with mean photon number $N$. With $\eta > 1/2$, the bosonic broadcast channel reduces to a degraded quantum broadcast channel, where Bob ($B$) is the less-noisy receiver and Charlie ($C$) is the more noisy (degraded) receiver. See the degraded version of the channel in Fig. 3-11.


and  capacity  is  achieved  using  product-coherent-state  encoding  with  a  Gaussian  prior 
density  as  in  the  case  of  the  noiseless  bosonic  broadcast  channel13. 
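The thermal-noise region of Theorem 3.4 can be evaluated in the same way as the noiseless one. In the sketch below, `n_bar` is the input photon-number constraint and `n_noise` is the thermal photon number of the environment; both parameter names and the function names are illustrative assumptions.

```python
import math


def g(x):
    """g(x) = (1 + x) ln(1 + x) - x ln(x), with g(0) = 0."""
    if x == 0:
        return 0.0
    return (1.0 + x) * math.log(1.0 + x) - x * math.log(x)


def thermal_rate_pair(eta, n_bar, n_noise, beta):
    """Boundary point (R_B, R_C) of Eqs. (3.68)-(3.69), in nats."""
    r_b = g(eta * beta * n_bar + (1.0 - eta) * n_noise) - g((1.0 - eta) * n_noise)
    r_c = (g((1.0 - eta) * n_bar + eta * n_noise)
           - g((1.0 - eta) * beta * n_bar + eta * n_noise))
    return r_b, r_c


if __name__ == "__main__":
    for n_noise in (0.0, 0.5, 2.0):
        print(n_noise, thermal_rate_pair(eta=0.8, n_bar=5.0, n_noise=n_noise, beta=0.5))
```

With `n_noise = 0` the expressions reduce to the noiseless bounds (3.41) and (3.42), and increasing `n_noise` lowers Bob's bound because $g$ is concave with $g(0) = 0$.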

Proof [Achievability] — It can be readily verified that the degraded broadcast condition still holds for the case of the bosonic broadcast channel with additive thermal noise (see Fig. 3-11). We generalize Yard et al.'s rate regions for degraded quantum broadcast channels, from Eqs. (3.27) and (3.28), to the case of the bosonic broadcast channel with coherent-state encoding and additive thermal noise in a similar way to

¹³ Note the striking similarity between the expressions for the rate region for the classical Gaussian-noise broadcast channel as given in Eqs. (3.19) and (3.20) and that for the rate region of the bosonic thermal-noise broadcast channel as we propose above in Eqs. (3.68) and (3.69). The expressions for these two rate regions are exactly identical except for the fact that the logarithmic function $g_c(\cdot)$ is replaced by the bosonic thermal-state entropy function $g(\cdot)$ in the quantum case. We will repeatedly encounter in this thesis instances of this analogous role that $g(\cdot)$ plays in the bosonic case, which the logarithmic function $g_c(\cdot)$ does in the classical Gaussian case. The observation of this analogy was one of the key initial hints that led us to conjecture the Entropy Photon-number Inequality (EPnI) [13] in analogy with the Entropy Power Inequality (EPI) of classical information theory. The EPnI subsumes all three minimum output entropy conjectures that we describe in chapter 4. We will discuss the EPnI in detail in Chapter 5 of this thesis, where we will see why the existence of a simple inverse of $g_c(\cdot)$ (i.e., the $\exp(\cdot)$ function) makes it a great deal easier to prove the EPI as opposed to the EPnI (whose general proof is still an open problem), because the inverse function of $g(\cdot)$ does not admit a simple analytic form.
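[Figure 3-11 diagram: Bob's mode $\hat{b}$ and a mode $\hat{f}$ enter a second beam splitter of transmissivity $\eta' = (1-\eta)/\eta$, with outputs $\hat{g} = \sqrt{\eta'}\,\hat{b} + \sqrt{1-\eta'}\,\hat{f}$ (Charlie, state $\hat{\rho}_g$) and $\hat{h} = \sqrt{1-\eta'}\,\hat{b} - \sqrt{\eta'}\,\hat{f}$. For comparison, Charlie's output in the original channel is $\hat{c} = \sqrt{1-\eta}\,\hat{a} - \sqrt{\eta}\,\hat{e}$ (state $\hat{\rho}_c$).]

Figure 3-11: The stochastically degraded version of the single-mode bosonic broadcast channel with additive thermal noise.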


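what we did for the noiseless broadcast channel¹⁴:

$$\begin{aligned}
R_B \le\; & \int p_T(\tau)\, S\!\left( \int p_{A|T}(\alpha|\tau) \left[ \int \frac{e^{-|\gamma - \sqrt{\eta}\,\alpha|^2 / ((1-\eta)N)}}{\pi(1-\eta)N}\, |\gamma\rangle\langle\gamma|\, d^2\gamma \right] d^2\alpha \right) d^2\tau \\
& - \int\!\!\int p_T(\tau)\, p_{A|T}(\alpha|\tau)\, S\!\left( \int \frac{e^{-|\gamma - \sqrt{\eta}\,\alpha|^2 / ((1-\eta)N)}}{\pi(1-\eta)N}\, |\gamma\rangle\langle\gamma|\, d^2\gamma \right) d^2\alpha\, d^2\tau,
\end{aligned} \tag{3.70}$$

$$\begin{aligned}
R_C \le\; & S\!\left( \int\!\!\int p_T(\tau)\, p_{A|T}(\alpha|\tau) \left[ \int \frac{e^{-|\gamma - \sqrt{1-\eta}\,\alpha|^2 / (\eta N)}}{\pi \eta N}\, |\gamma\rangle\langle\gamma|\, d^2\gamma \right] d^2\alpha\, d^2\tau \right) \\
& - \int p_T(\tau)\, S\!\left( \int p_{A|T}(\alpha|\tau) \left[ \int \frac{e^{-|\gamma - \sqrt{1-\eta}\,\alpha|^2 / (\eta N)}}{\pi \eta N}\, |\gamma\rangle\langle\gamma|\, d^2\gamma \right] d^2\alpha \right) d^2\tau,
\end{aligned} \tag{3.71}$$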


where, in order to get the $n = 1$ capacity region, we need to maximize the bounds for $R_B$ and $R_C$ over all complex-valued joint distributions $p_T(\tau)\, p_{A|T}(\alpha|\tau)$ subject to $\langle |\alpha|^2 \rangle \le \bar{N}$. Note that $A$ and $T$ are two complex-valued random variables, and the second term in the bound for $R_B$ (see Equation (3.27)) is non-zero, because the conditional output states at the two receivers are now mixed states in general. Substituting the distributions from Eqs. (3.43) and (3.44) into the expressions for the rate-bounds in Eqs. (3.70) and (3.71), and using the fact that the von Neumann entropy of a thermal state with mean photon number $N$ is equal to $g(N)$, we obtain the rate-bounds in the capacity theorem above. It follows that the rate region (3.68), (3.69) is achievable.

¹⁴ Let us limit the auxiliary-input alphabet ($T$) to coherent states in the finite-dimensional subspace spanned by the Fock states $\{|0\rangle, |1\rangle, \ldots, |K_1\rangle\}$, and limit the thermal-noise state $\hat{\rho}_e$ to the span of $\{|0\rangle, |1\rangle, \ldots, |K_2\rangle\}$, such that $K_1 + K_2 \gg \bar{N} + N$. Applying Yard et al.'s theorem to the Hilbert space spanned by these states then gives us a broadcast channel capacity region that must be strictly an inner bound of the rate region given by Eqs. (3.70) and (3.71). In the limit in which we choose $K_1$ and $K_2$ sufficiently large (maintaining the cardinality condition $|T| \le |A|$ that is required by the theorem), the rate-region expressions given by Yard et al.'s theorem can be brought as close as we wish to those given by Eqs. (3.70) and (3.71).

Proof [Converse] — Assume that the rate pair $(R_B, R_C)$ is achievable. Let us begin with the same initial steps as in the proof of the converse of the capacity theorem for the noiseless bosonic broadcast channel. Equations (3.51) and (3.52) still hold. Thus, in order to prove the converse for the thermal-noise broadcast channel, we now need to show that there exists a number $\beta \in [0,1]$, such that

$$\sum_k p_{W_C}(k)\, \chi(p_{W_B}(m), \hat{\rho}_{m,k}^{B^n}) \le n g(\eta\beta\bar{N} + (1-\eta)N) - n g((1-\eta)N), \tag{3.72}$$

$$\chi(p_{W_C}(k), \hat{\rho}_k^{C^n}) \le n g((1-\eta)\bar{N} + \eta N) - n g((1-\eta)\beta\bar{N} + \eta N). \tag{3.73}$$

Assuming the truth of strong conjecture 1 (see chapter 4), the minimum entropy of Bob's $n$-mode state is achieved when Alice sends a product of vacuum states (or a product of arbitrary coherent states). Thus using strong conjecture 1 we have, for all $(m,k) \in (W_B, W_C)$,

$$S(\hat{\rho}_{m,k}^{B^n}) \ge n g((1-\eta)N). \tag{3.74}$$

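From the non-negativity of the Holevo information $\chi(p_{W_B}(m), \hat{\rho}_{m,k}^{B^n})$, it follows that¹⁵

$$\begin{aligned}
S(\hat{\rho}_k^{B^n}) &\ge \sum_m p_{W_B}(m)\, S(\hat{\rho}_{m,k}^{B^n}) \qquad (3.75) \\
&\ge n g((1-\eta)N). \qquad (3.76)
\end{aligned}$$

Let $\bar{N}_k^A \equiv \frac{1}{n} \sum_{j=1}^{n} \bar{N}_{k_j}^A$, where $\bar{N}_{k_j}^A$ is the mean photon number of the $j$th symbol $\hat{\rho}_k^{A_j}$ of the $n$-symbol codeword $\hat{\rho}_k^{A^n}$, for $j \in \{1, \ldots, n\}$. Similarly, let $\bar{N}_k^B \equiv \frac{1}{n} \sum_{j=1}^{n} \bar{N}_{k_j}^B$, where $\bar{N}_{k_j}^B$ is the mean photon number of the $j$th symbol $\hat{\rho}_k^{B_j}$ of the $n$-symbol codeword $\hat{\rho}_k^{B^n}$, for $j \in \{1, \ldots, n\}$. The overall mean photon numbers per channel use for Alice and Bob are thus given by an average over the codebook $W_C$, i.e., $\bar{N} = 2^{-nR_C} \sum_k \bar{N}_k^A$, and $\bar{N}_B = 2^{-nR_C} \sum_k \bar{N}_k^B$. From the input-output relation of the channel, the following must hold:

$$\bar{N}_{k_j}^B = \eta \bar{N}_{k_j}^A + (1-\eta)N, \quad \forall k, j, \tag{3.77}$$

$$\bar{N}_k^B = \eta \bar{N}_k^A + (1-\eta)N, \quad \forall k, \text{ and} \tag{3.78}$$

$$\bar{N}_B = \eta \bar{N} + (1-\eta)N. \tag{3.79}$$

¹⁵ From the definition of Holevo information, we have

$$\begin{aligned}
\chi(p_{W_B}(m), \hat{\rho}_{m,k}^{B^n}) &= S\!\left( \sum_m p_{W_B}(m)\, \hat{\rho}_{m,k}^{B^n} \right) - \sum_m p_{W_B}(m)\, S(\hat{\rho}_{m,k}^{B^n}) \\
&= S(\hat{\rho}_k^{B^n}) - \sum_m p_{W_B}(m)\, S(\hat{\rho}_{m,k}^{B^n}) \\
&\ge 0.
\end{aligned}$$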

Using Eq. (3.76), the fact that the maximum von Neumann entropy of a single-mode bosonic state with mean photon number $N$ is given by $g(N)$, and the concavity of $g(x)$, we have

$$n g((1-\eta)N) \le S(\hat{\rho}_k^{B^n}) \le \sum_{j=1}^{n} g(\bar{N}_{k_j}^B) \le n g(\bar{N}_k^B) = n g(\eta \bar{N}_k^A + (1-\eta)N). \tag{3.80}$$

Therefore, given the monotonicity of $g(x)$, $\exists\, \beta_k \in [0,1]$, $\forall k \in W_C$, such that

$$S(\hat{\rho}_k^{B^n}) = n g(\eta \beta_k \bar{N}_k^A + (1-\eta)N). \tag{3.81}$$

The average number of photons per use at the transmitter (Alice), averaged over the entire codebook $(W_B, W_C)$, is $\bar{N}$. Thus, the mean photon number of the $n$-use average codeword for Bob, $\hat{\rho}^{B^n} \equiv \sum_k p_{W_C}(k)\, \hat{\rho}_k^{B^n}$, is $\eta\bar{N} + (1-\eta)N$. Hence,

$$n g((1-\eta)N) \le \sum_k p_{W_C}(k)\, S(\hat{\rho}_k^{B^n}) \le S(\hat{\rho}^{B^n}) \le n g(\eta\bar{N} + (1-\eta)N), \tag{3.82}$$

where the first inequality assumes strong conjecture 1 and the second inequality follows from the concavity of von Neumann entropy. The monotonicity of $g(x)$ then implies that there is a $\beta \in [0,1]$, such that

$$\sum_k p_{W_C}(k)\, S(\hat{\rho}_k^{B^n}) = n g(\eta\beta\bar{N} + (1-\eta)N). \tag{3.83}$$

We thus have

$$\begin{aligned}
\sum_k p_{W_C}(k)\, & \chi(p_{W_B}(m), \hat{\rho}_{m,k}^{B^n}) \\
&= \sum_k p_{W_C}(k)\, S\!\left( \sum_m p_{W_B}(m)\, \hat{\rho}_{m,k}^{B^n} \right) - \sum_{k,m} p_{W_C}(k)\, p_{W_B}(m)\, S(\hat{\rho}_{m,k}^{B^n}) \qquad (3.84) \\
&= \sum_k p_{W_C}(k)\, S(\hat{\rho}_k^{B^n}) - \sum_{k,m} p_{W_C}(k)\, p_{W_B}(m)\, S(\hat{\rho}_{m,k}^{B^n}) \qquad (3.85) \\
&\le n g(\eta\beta\bar{N} + (1-\eta)N) - n g((1-\eta)N), \qquad (3.86)
\end{aligned}$$

where the last inequality follows from Eqs. (3.83) and (3.74). This completes the first part of the converse proof, i.e., inequality (3.72).

Because of the degraded nature of the channel, Charlie's state can be obtained as the output of a beam splitter of transmissivity $\eta' = (1-\eta)/\eta$, whose input states are Bob's state and a thermal state of mean photon number $N$ (see Fig. 3-11). It follows, from assuming the truth of strong conjecture 3 (see chapter 4), that

$$\begin{aligned}
S(\hat{\rho}_k^{C^n}) &\ge n g\!\left( \eta'(\eta \beta_k \bar{N}_k^A + (1-\eta)N) + (1-\eta')N \right) \qquad (3.87) \\
&= n g((1-\eta)\beta_k \bar{N}_k^A + \eta N). \qquad (3.88)
\end{aligned}$$

Equations (3.81), (3.83), and the uniform distribution $p_{W_C}(k) = 1/2^{nR_C}$ imply that

$$\sum_k \frac{1}{2^{nR_C}}\, g(\eta \beta_k \bar{N}_k^A + (1-\eta)N) = g(\eta\beta\bar{N} + (1-\eta)N). \tag{3.89}$$

Using (3.89), the concavity of $g(x)$, and $\eta > 1/2$, we have shown (proof in Appendix C) that

$$\sum_k \frac{1}{2^{nR_C}}\, g((1-\eta)\beta_k \bar{N}_k^A + \eta N) \ge g((1-\eta)\beta\bar{N} + \eta N). \tag{3.90}$$

From Eq. (3.90), and (3.88) summed over $k$, we then obtain

$$\sum_k p_{W_C}(k)\, S(\hat{\rho}_k^{C^n}) \ge n g((1-\eta)\beta\bar{N} + \eta N). \tag{3.91}$$

Finally, we bound Charlie's Holevo information using the standard maximum entropy bound with a mean photon number constraint and Eq. (3.91), which yields:

$$\begin{aligned}
\chi(p_{W_C}(k), \hat{\rho}_k^{C^n}) &= S\!\left( \sum_k p_{W_C}(k)\, \hat{\rho}_k^{C^n} \right) - \sum_k p_{W_C}(k)\, S(\hat{\rho}_k^{C^n}) \\
&\le n g((1-\eta)\bar{N} + \eta N) - n g((1-\eta)\beta\bar{N} + \eta N),
\end{aligned} \tag{3.92}$$

completing the proof of the second piece of the converse, i.e., that of inequality (3.73). The capacity region is additive, because the achievability part of the proof above shows that a product distribution over a single-use coherent-state alphabet achieves the rate region.


3.4.6  Noiseless  bosonic  broadcast  channel  with  M  receivers 


Let us now consider a bosonic broadcast channel in which the transmitter Alice ($A$) sends independent messages to $M$ receivers, $Y_0, \ldots, Y_{M-1}$. Let us label Alice's modal annihilation operator as $\hat{a}$, and the annihilation operators for the receivers $Y_l$ as $\hat{y}_l$, $\forall l \in \{0, \ldots, M-1\}$. In order to characterize the bosonic broadcast channel as a quantum-mechanically correct representation of the evolution of a closed system, we must incorporate $M-1$ environment inputs $\{E_1, \ldots, E_{M-1}\}$ along with the transmitter $A$, such that the $M$ output annihilation operators are related to the $M$ input annihilation operators through a unitary matrix, i.e.,

$$\begin{pmatrix} \hat{y}_0 \\ \hat{y}_1 \\ \vdots \\ \hat{y}_{M-1} \end{pmatrix} = U \begin{pmatrix} \hat{a} \\ \hat{e}_1 \\ \vdots \\ \hat{e}_{M-1} \end{pmatrix}, \tag{3.93}$$
Figure 3-12: An $M$-receiver noiseless bosonic broadcast channel. Transmitter Alice ($A$) sends independent messages to $M$ receivers, $Y_0, \ldots, Y_{M-1}$. We have labeled Alice's modal annihilation operator as $\hat{a}$, and those of the receivers $Y_l$ as $\hat{y}_l$, $\forall l \in \{0, \ldots, M-1\}$. In order to characterize the bosonic broadcast channel as a quantum-mechanically correct representation of the evolution of a closed system, we must incorporate $M-1$ environment inputs $\{E_1, \ldots, E_{M-1}\}$ along with the transmitter $A$ (whose modal annihilation operators have been labeled as $\{\hat{e}_1, \ldots, \hat{e}_{M-1}\}$), such that the $M$ output annihilation operators are related to the $M$ input annihilation operators through a unitary matrix, as given in Eq. (3.93). For the noiseless bosonic broadcast channel, all the $M-1$ environment modes $\hat{e}_k$ are in their vacuum states. The transmitter is constrained to at most $\bar{N}$ photons on average per channel use for encoding the data. The fractional power coupling from the transmitter to the receiver $Y_k$ is taken to be $\eta_k$. We have labeled the receivers in such a way that $1 \ge \eta_0 \ge \eta_1 \ge \cdots \ge \eta_{M-1} \ge 0$. This ordering of the transmissivities renders this channel a degraded quantum broadcast channel $A \to Y_0 \to \cdots \to Y_{M-1}$ (see Fig. 3-13). The fractional power coupling from $E_k$ to $Y_l$ has been taken to be $\eta_{kl}$. For $M = 2$, the above channel model reduces to the familiar two-receiver beam splitter channel model as given in Fig. 3-7.
where $\{\hat{e}_1, \ldots, \hat{e}_{M-1}\}$ are the modal annihilation operators of the $M-1$ environment modes (see Fig. 3-12). The unitary matrix describing the channel can be expressed in the most general form as:

$$U = \begin{pmatrix}
\sqrt{\eta_0} & \sqrt{\eta_{10}}\, e^{i\phi_{10}} & \cdots & \sqrt{\eta_{M-1,0}}\, e^{i\phi_{M-1,0}} \\
\sqrt{\eta_1} & \sqrt{\eta_{11}}\, e^{i\phi_{11}} & \cdots & \sqrt{\eta_{M-1,1}}\, e^{i\phi_{M-1,1}} \\
\vdots & \vdots & \ddots & \vdots \\
\sqrt{\eta_{M-1}} & \sqrt{\eta_{1,M-1}}\, e^{i\phi_{1,M-1}} & \cdots & \sqrt{\eta_{M-1,M-1}}\, e^{i\phi_{M-1,M-1}}
\end{pmatrix}, \tag{3.94}$$

where $\{\eta_0, \ldots, \eta_{M-1}\}$ are the transmissivities (fractional power couplings) from the transmitter $A$ to the $M$ receivers $Y_0, \ldots, Y_{M-1}$. Without loss of generality, we have numbered the receivers so that the transmissivities are in decreasing order, i.e.,

$$1 \ge \eta_0 \ge \eta_1 \ge \cdots \ge \eta_{M-1} \ge 0. \tag{3.95}$$

The power coupling from the environment mode $\hat{e}_k$ to the output mode $\hat{y}_l$ is $\eta_{kl}$. Without loss of generality, the phases for the entries of the first column of $U$ have been taken to be 0, as an overall phase is inconsequential in each of the $M$ input-output relations,

$$\hat{y}_k = \sqrt{\eta_k}\, \hat{a} + \sum_{l=1}^{M-1} \sqrt{\eta_{lk}}\, e^{i\phi_{lk}}\, \hat{e}_l. \tag{3.96}$$

The fractional power couplings must satisfy the following normalization constraints,

$$\sum_{k=0}^{M-1} \eta_k = 1, \tag{3.97}$$

$$\sum_{k=0}^{M-1} \eta_{lk} = 1, \quad \forall l \in \{1, \ldots, M-1\}, \tag{3.98}$$

$$\eta_k + \sum_{l=1}^{M-1} \eta_{lk} = 1, \quad \forall k \in \{0, \ldots, M-1\}. \tag{3.99}$$
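For $M = 2$ these constraints can be checked against the beam splitter of Fig. 3-7, for which $\eta_0 = \eta$, $\eta_1 = 1-\eta$, $\eta_{10} = 1-\eta$, and $\eta_{11} = \eta$, with phases $\phi_{10} = 0$ and $\phi_{11} = \pi$ supplying the minus sign in Charlie's relation. The sketch below builds that matrix and tests unitarity; the function names are hypothetical and the code handles only this two-receiver instance of the general form (3.94).

```python
import cmath
import math


def two_receiver_unitary(eta):
    """Matrix U of Eq. (3.94) for M = 2, matching the beam splitter of Fig. 3-7."""
    eta_k = [eta, 1.0 - eta]    # eta_0, eta_1: couplings from the transmitter
    eta_1k = [1.0 - eta, eta]   # eta_10, eta_11: couplings from environment mode e_1
    phi_1k = [0.0, math.pi]     # phi_11 = pi gives the minus sign for Charlie
    return [[math.sqrt(eta_k[k]), math.sqrt(eta_1k[k]) * cmath.exp(1j * phi_1k[k])]
            for k in range(2)]


def is_unitary(u, tol=1e-12):
    """Check that the rows of the square matrix u are orthonormal."""
    size = len(u)
    for i in range(size):
        for j in range(size):
            inner = sum(u[i][m] * u[j][m].conjugate() for m in range(size))
            target = 1.0 if i == j else 0.0
            if abs(inner - target) > tol:
                return False
    return True


if __name__ == "__main__":
    print(is_unitary(two_receiver_unitary(0.8)))
```

Row normalization corresponds to (3.99) and column normalization to (3.97) and (3.98); for a square matrix, orthonormal rows imply orthonormal columns.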

Theorem  3.5  —  For  the  noiseless  bosonic  broadcast  channel,  i.e.,  when  the  environ¬ 
ment  modes  {e*,  :  1  <  k  <  M  —  1}  are  in  a  product  of  M  —  1  vacuum  states, 


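[Figure 3-13 diagram: the transmitter $A$ feeds a chain of degraded receivers $A \to Y_0 \to Y_1 \to \cdots \to Y_{M-2} \to Y_{M-1}$, with $\hat{y}_0 = \sqrt{\eta_0}\,\hat{a} + \sqrt{1-\eta_0}\,\hat{f}_0$ and $\hat{y}_k = \sqrt{\eta_k/\eta_{k-1}}\,\hat{y}_{k-1} + \sqrt{1-\eta_k/\eta_{k-1}}\,\hat{f}_k$, $\forall k \in [1, M-1]$.]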


Figure 3-13: An equivalent stochastically degraded model for the M-receiver noiseless bosonic broadcast channel depicted in Fig. 3-12. If the receivers are ordered in a way such that the fractional power couplings $\eta_k$ from the transmitter to the receiver $Y_k$ are in decreasing order, the quantum states at each receiver $Y_k$, for $k \in \{1, \ldots, M-1\}$, can be obtained from the state received at receiver $Y_{k-1}$ by mixing it with a vacuum state, through a beam splitter of transmissivity $\eta_k/\eta_{k-1}$. This equivalent representation of the M-receiver bosonic broadcast channel confirms that the bosonic broadcast channel is indeed a degraded broadcast channel, whose capacity region is given by the infinite-dimensional (continuous-variable) extension of Yard et al.'s theorem in Eqs. (3.38).


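and with an input mean photon-number constraint $\langle \hat{a}^\dagger \hat{a} \rangle \le \bar{N}$, the ultimate capacity region$^{16}$ is given by

$$R_k \le g(\eta_k \beta_{k+1} \bar{N}) - g(\eta_k \beta_k \bar{N}), \quad k \in \{0, \ldots, M-1\}, \tag{3.100}$$

where,

$$0 = \beta_0 < \beta_1 < \cdots < \beta_{M-1} < \beta_M = 1. \tag{3.101}$$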
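The rate bounds of Eq. (3.100) are straightforward to evaluate numerically. The sketch below assumes the usual definition $g(x) = (1+x)\ln(1+x) - x\ln x$ (in nats), which is introduced earlier in the thesis and not restated in this section; the function names and the example values of $\eta_k$ and $\beta_k$ are hypothetical.

```python
import math


def g(x):
    """Entropy of a thermal state with mean photon number x, in nats (g(0) = 0)."""
    if x <= 0.0:
        return 0.0
    return (1.0 + x) * math.log(1.0 + x) - x * math.log(x)


def noiseless_rate_bounds(eta, beta, n_bar):
    """Right-hand sides of Eq. (3.100) for k = 0..M-1.

    eta  : [eta_0, ..., eta_{M-1}], transmissivities in decreasing order
    beta : [beta_0, ..., beta_M] with beta_0 = 0 and beta_M = 1
    n_bar: mean photon number allowed at the transmitter
    """
    if len(beta) != len(eta) + 1:
        raise ValueError("beta must have one more entry than eta")
    return [g(eta[k] * beta[k + 1] * n_bar) - g(eta[k] * beta[k] * n_bar)
            for k in range(len(eta))]


# Hypothetical three-receiver example.
eta = [0.6, 0.3, 0.1]
rates = noiseless_rate_bounds(eta, [0.0, 0.2, 0.5, 1.0], 15.0)
# Extreme allocation beta_1 = 1: only receiver Y_0 gets a non-zero bound.
all_to_first = noiseless_rate_bounds(eta, [0.0, 1.0, 1.0, 1.0], 15.0)
print(rates, all_to_first)
```

Sweeping the $\beta_k$ over all admissible orderings traces out the boundary of the region; the second call shows that the bound for $Y_0$ then reduces to the single-user value $g(\eta_0 \bar{N})$.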

Proof [Achievability] - Using the infinite-dimensional (continuous-variable) extension of Eqs. (3.38), the $n = 1$ rate region for the bosonic broadcast channel using


16 Note the similarity with the capacity region for the classical Gaussian broadcast channel, as given in Eq. (3.22), with $N = 0$. Also note that Eq. (3.100) reduces to the two-user noiseless bosonic broadcast capacity region, as given in Eqs. (3.41) and (3.42), with the substitutions $\eta_0 = \eta$ and $\eta_1 = 1 - \eta$.

coherent-state encoding is given by$^{17}$ (see Fig. 3-13 and Fig. 3-14 for notation):


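$$\begin{aligned}
R_0 &\le \int p_{T_1}(\tau_1)\, S\!\left(\int p_{A|T_1}(\alpha|\tau_1)\, |\sqrt{\eta_0}\,\alpha\rangle\langle\sqrt{\eta_0}\,\alpha|\, d^2\alpha\right) d^2\tau_1, \\
R_k &\le \int p_{T_{k+1}}(\tau_{k+1})\, \chi\!\left(p_{T_k|T_{k+1}}(\tau_k|\tau_{k+1}),\, \hat{\rho}^{Y_k}_{\tau_k}\right) d^2\tau_{k+1} \\
&= \int p_{T_{k+1}}(\tau_{k+1}) \left( S\!\left(\int p_{T_k|T_{k+1}}(\tau_k|\tau_{k+1})\, \hat{\rho}^{Y_k}_{\tau_k}\, d^2\tau_k\right) - \int p_{T_k|T_{k+1}}(\tau_k|\tau_{k+1})\, S\!\left(\hat{\rho}^{Y_k}_{\tau_k}\right) d^2\tau_k \right) d^2\tau_{k+1}, \quad \text{for } k \in \{1, \ldots, M-2\}, \\
R_{M-1} &\le \chi\!\left(p_{T_{M-1}}(\tau_{M-1}),\, \hat{\rho}^{Y_{M-1}}_{\tau_{M-1}}\right) \\
&= S\!\left(\int p_{T_{M-1}}(\tau_{M-1})\, \hat{\rho}^{Y_{M-1}}_{\tau_{M-1}}\, d^2\tau_{M-1}\right) - \int p_{T_{M-1}}(\tau_{M-1})\, S\!\left(\hat{\rho}^{Y_{M-1}}_{\tau_{M-1}}\right) d^2\tau_{M-1},
\end{aligned} \tag{3.103}$$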


where we need to maximize the above rate region $(R_0, \ldots, R_{M-1})$ over all joint distributions $p_{T_{M-1}}(\tau_{M-1})\, p_{T_{M-2}|T_{M-1}}(\tau_{M-2}|\tau_{M-1}) \cdots p_{T_1|T_2}(\tau_1|\tau_2)\, p_{A|T_1}(\alpha|\tau_1)$ subject to $\langle |\alpha|^2 \rangle \le \bar{N}$. Note that $A$ and the auxiliary random variables $T_1, \ldots, T_{M-1}$ are complex-valued, and the second term in the $R_0$ bound (see (3.38)) vanishes, because the von Neumann entropy of a pure state is zero.

Let us associate with each random variable $T_k$ a quantum system, i.e., a coherent-state alphabet $\{|\tau_k\rangle\}$ and a modal annihilation operator $\hat{t}_k$, $\forall k \in \{1, \ldots, M-1\}$. In


17 Here, we use a continuous-variable version of the notation we used in Eqs. (3.38). When the cardinalities $|\mathcal{A}|$ and $|\mathcal{T}_k|$, $1 \le k \le M-1$, are finite, and we are using coherent states, we end up with a finite number of possible transmitted states, which leads to a finite number of possible states received by Bob and Charlie. To be more explicit, let us limit the auxiliary-input alphabets ($\mathcal{T}_k$, $1 \le k \le M-1$) - and hence the input ($\mathcal{A}$) and the output alphabets ($\mathcal{Y}_k$, $0 \le k \le M-1$) - to coherent states in the finite-dimensional subspace spanned by the Fock states $\{|0\rangle, |1\rangle, \ldots, |K\rangle\}$, where $K \gg \bar{N}$. Applying the extension of Yard et al.'s theorem to M receivers (3.38), the Hilbert space spanned by these states then gives us a broadcast channel capacity region that must be strictly an inner bound of the rate region given by Eqs. (3.103). In the limit that we choose $K$ sufficiently large, the rate-region expressions given by Eqs. (3.38) can be brought as close as we wish to those given by Eqs. (3.103). The summations in Eqs. (3.38) get replaced by integrals. The collective message index $j$ is now replaced by the complex number $\alpha$, the indices $i_k$ are replaced by $\tau_k$, and the density matrices of the conditional received states are given by:

$$\hat{\rho}^{Y_k}_{\tau_k} = \int \cdots \int p_{A|T_1}(\alpha|\tau_1)\, p_{T_1|T_2}(\tau_1|\tau_2) \cdots p_{T_{k-1}|T_k}(\tau_{k-1}|\tau_k)\, \hat{\rho}^{Y_k}_{\alpha}\, d^2\tau_{k-1} \cdots d^2\tau_1\, d^2\alpha, \tag{3.102}$$

where $\hat{\rho}^{Y_k}_{\alpha} = |\sqrt{\eta_k}\,\alpha\rangle\langle\sqrt{\eta_k}\,\alpha|$ is the state received by the receiver $Y_k$ when the transmitter sends a coherent state $\hat{\rho}^{A}_{\alpha} = |\alpha\rangle\langle\alpha|$.
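[Figure 3-14 diagram: Markov chain of auxiliary classical complex-valued random variables $T_{M-1} \to T_{M-2} \to \cdots \to T_2 \to T_1 \to A$ at the transmitter, realized by the modes $\hat{t}_{M-1}, \ldots, \hat{t}_1, \hat{a}$ mixed with the modes $\hat{u}_{M-1}, \ldots, \hat{u}_1$ through beam splitters labeled $\gamma_{M-1}, \ldots, \gamma_1$, with $\hat{a} = \sqrt{1-\gamma_1}\,\hat{t}_1 + \sqrt{\gamma_1}\,\hat{u}_1$ and $\hat{t}_{k-1} = \sqrt{1-\gamma_k}\,\hat{t}_k + \sqrt{\gamma_k}\,\hat{u}_k$, $\forall k \in [2, M-1]$.]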


Figure 3-14: In order to evaluate the capacity region of the M-receiver noiseless bosonic degraded broadcast channel depicted in Fig. 3-13 using a coherent-state input alphabet $\{|\alpha\rangle\}$, $\alpha \in \mathbb{C}$ and $\langle \hat{a}^\dagger \hat{a} \rangle = \langle |\alpha|^2 \rangle \le \bar{N}$, we choose the $M - 1$ auxiliary classical Markov random variables (in Eqs. (3.35)) as complex-valued random variables $T_k$, $k \in \{1, \ldots, M-1\}$, taking values $\tau_k \in \mathbb{C}$. In order to visualize the postulated optimal Gaussian distributions for the random variables $T_k$, let us associate with $T_k$ a quantum system, i.e., a coherent-state alphabet $\{|\tau_k\rangle\}$ and modal annihilation operator $\hat{t}_k$, $\forall k$. In accordance with the Markov property of the random variables $T_k$, let $\hat{t}_{M-1}$ be in an isotropic zero-mean Gaussian mixture of coherent states with a variance $\bar{N}$ (see Eq. (3.104)), and for $k \in \{1, \ldots, M-2\}$, let $\hat{t}_k$ be obtained from $\hat{t}_{k+1}$ by mixing it with another mode $\hat{u}_{k+1}$ excited in a zero-mean thermal state with mean photon number $\bar{N}$, through a beam splitter with transmissivity $1 - \gamma_{k+1}$, as shown in the figure above, for some $\gamma_{k+1} \in (0, 1)$. We complete the Markov chain $T_{M-1} \to \cdots \to T_1 \to A$ by obtaining the transmitter mode $\hat{a}$ by mixing $\hat{t}_1$ with a mode $\hat{u}_1$ excited in a zero-mean thermal state with mean photon number $\bar{N}$, through a beam splitter with transmissivity $1 - \gamma_1$, for $\gamma_1 \in (0, 1)$. The above setup of the auxiliary modes gives rise to the distributions given in Eqs. (3.104), which we use to evaluate the achievable rate region of the M-receiver bosonic broadcast channel using coherent-state encoding.
accordance with the Markov property of the random variables $T_k$, let $\hat{t}_{M-1}$ be in an isotropic zero-mean Gaussian mixture of coherent states with a variance $\bar{N}$ (see Eq. (3.104)), and for $k \in \{1, \ldots, M-2\}$, let $\hat{t}_k$ be obtained from $\hat{t}_{k+1}$ by mixing it with another mode $\hat{u}_{k+1}$ excited in a zero-mean thermal state with mean photon number $\bar{N}$, through a beam splitter with transmissivity $1 - \gamma_{k+1}$, as shown in Fig. 3-14, for real numbers $\gamma_{k+1} \in (0, 1)$. We complete the Markov chain $T_{M-1} \to \cdots \to T_1 \to A$ by obtaining the transmitter mode $\hat{a}$ by mixing $\hat{t}_1$ with a mode $\hat{u}_1$ excited in a zero-mean thermal state with mean photon number $\bar{N}$ (as in Fig. 3-14, and as required for the conditional variance $\gamma_1 \bar{N}$ in Eq. (3.104)), through a beam splitter with transmissivity $1 - \gamma_1$, for $\gamma_1 \in (0, 1)$. This setup of the auxiliary modes gives rise to the distributions given below, which we use to evaluate the achievable rate region using coherent-state encoding:


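$$\begin{aligned}
p_{A|T_1}(\alpha|\tau_1) &= \frac{1}{\pi \gamma_1 \bar{N}} \exp\!\left(-\frac{|\sqrt{1-\gamma_1}\,\tau_1 - \alpha|^2}{\gamma_1 \bar{N}}\right), \\
p_{T_k|T_{k+1}}(\tau_k|\tau_{k+1}) &= \frac{1}{\pi \gamma_{k+1} \bar{N}} \exp\!\left(-\frac{|\sqrt{1-\gamma_{k+1}}\,\tau_{k+1} - \tau_k|^2}{\gamma_{k+1} \bar{N}}\right), \quad \text{for } k \in \{1, \ldots, M-2\}, \\
p_{T_{M-1}}(\tau_{M-1}) &= \frac{1}{\pi \bar{N}} \exp\!\left(-\frac{|\tau_{M-1}|^2}{\bar{N}}\right).
\end{aligned} \tag{3.104}$$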


Substituting Eqs. (3.104) into Eqs. (3.103), we get

$$\begin{aligned}
R_0 &\le g(\eta_0 \beta_1 \bar{N}), \\
R_k &\le g(\eta_k \beta_{k+1} \bar{N}) - g(\eta_k \beta_k \bar{N}), \quad \text{for } k \in \{1, \ldots, M-2\}, \\
R_{M-1} &\le g(\eta_{M-1} \bar{N}) - g(\eta_{M-1} \beta_{M-1} \bar{N}),
\end{aligned} \tag{3.105}$$

where we define

$$\beta_k \equiv 1 - \prod_{i=1}^{k} (1 - \gamma_i), \quad \text{for } k \in \{1, \ldots, M-1\}. \tag{3.106}$$

By further defining $\beta_0 \equiv 0$ and $\beta_M \equiv 1$, we have by construction, $0 = \beta_0 < \beta_1 < \cdots < \beta_{M-1} < \beta_M = 1$. With these definitions, Eqs. (3.105) reduce to the rate-region expression given in Eq. (3.100). Hence the postulated rate region is achievable using single-use coherent-state encoding.
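The role of $\beta_k$ in Eq. (3.106) can be checked directly: $\beta_k \bar{N}$ should be the variance of $\alpha$ conditioned on $\tau_k$ when the Gaussian chain of Eqs. (3.104) is followed one beam splitter at a time. The following sketch makes that comparison for a hypothetical choice of $\gamma_1, \gamma_2, \gamma_3$ ($M = 4$); the function names are illustrative only.

```python
def beta_from_gamma(gamma):
    """beta_k = 1 - prod_{i<=k} (1 - gamma_i), Eq. (3.106); returns [beta_1, ...]."""
    betas, surviving = [], 1.0
    for gamma_i in gamma:
        surviving *= 1.0 - gamma_i
        betas.append(1.0 - surviving)
    return betas


def conditional_variances(gamma, n_bar):
    """Var(alpha | tau_k) for k = 1..M-1, accumulated beam splitter by beam splitter."""
    variances, path_gain, total = [], 1.0, 0.0
    for gamma_k in gamma:
        # Mode u_k adds variance gamma_k * n_bar at t_{k-1}; that contribution is
        # attenuated by path_gain = prod_{i<k} (1 - gamma_i) on its way to mode a.
        total += path_gain * gamma_k * n_bar
        variances.append(total)
        path_gain *= 1.0 - gamma_k
    return variances


gamma = [0.3, 0.5, 0.2]   # hypothetical gamma_1, gamma_2, gamma_3
n_bar = 15.0
betas = beta_from_gamma(gamma)
variances = conditional_variances(gamma, n_bar)

# Unconditional variance of alpha: attenuated variance of tau_{M-1} plus injected noise.
residual_gain = 1.0
for gamma_i in gamma:
    residual_gain *= 1.0 - gamma_i
total_variance = residual_gain * n_bar + variances[-1]
print(betas, variances, total_variance)
```

Under these assumptions the unconditional variance of $\alpha$ comes out equal to $\bar{N}$, so the input constraint is met with equality regardless of the $\gamma_k$.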


Proof [Converse] - Our goal in proving the converse is to show that any achievable rate M-tuple $(R_0, \ldots, R_{M-1})$ must be inside the ultimate rate region proposed by Eqs. (3.105). Let us assume that $(R_0, \ldots, R_{M-1})$ is achievable. Using the notation in Eq. (3.33), let $\{x^n(m_0, \ldots, m_{M-1})\}$ and POVMs $\{\Lambda^{0}_{m_0}\}, \ldots, \{\Lambda^{M-1}_{m_{M-1}}\}$ comprise a $(2^{nR_0}, \ldots, 2^{nR_{M-1}}, n, \epsilon)$ code in the achieving sequence. Let us suppose that the receivers $Y_0, \ldots, Y_{M-1}$ store their respective decoded messages in registers $\hat{W}_0, \ldots, \hat{W}_{M-1}$. By assuming a good source encoder prior to the broadcast channel encoder, it is fair to assume a uniform distribution over the messages, i.e.,
Lemma 3.6 - For every $k \in \{1, \ldots, M-1\}$, $\exists\, \beta_k \in [0, 1]$, s.t.$^{18}$

$$\sum_{m_k^{M-1}} p_{W_k^{M-1}}(m_k^{M-1})\, S\!\left(\hat{\rho}^{Y_{k-1}^n}_{m_k^{M-1}}\right) = n\, g(\eta_{k-1} \beta_k \bar{N}). \tag{3.111}$$

Proof - We have

$$0 \le \sum_{m_k^{M-1}} p_{W_k^{M-1}}(m_k^{M-1})\, S\!\left(\hat{\rho}^{Y_{k-1}^n}_{m_k^{M-1}}\right) \le S\!\left(\hat{\rho}^{Y_{k-1}^n}\right) \le n\, g(\eta_{k-1} \bar{N}), \tag{3.112}$$

where the first inequality follows from the non-negativity of von Neumann entropy. The second inequality follows from concavity of von Neumann entropy or equivalently from the non-negativity of Holevo information (see footnote 15), because

$$\hat{\rho}^{Y_{k-1}^n} = \sum_{m_k^{M-1}} p_{W_k^{M-1}}(m_k^{M-1})\, \hat{\rho}^{Y_{k-1}^n}_{m_k^{M-1}}.$$

The third inequality above is due to the fact that the maximum entropy of an n-mode state with a mean photon number $\bar{n}$ per mode is given by $n\, g(\bar{n})$. From the monotonicity of the function $g(\cdot)$, there must therefore exist a real number $\beta_k \in [0, 1]$,


18 We defined earlier in this chapter $m_0^{M-1} \equiv \{m_0, \ldots, m_{M-1}\}$ to be a collective index for the M messages that Alice encodes into her n-use transmitted codeword state $\hat{\rho}^{A^n}_{m_0^{M-1}}$, and $\hat{\rho}^{Y_k^n}_{m_0^{M-1}}$ was defined to be the state received by $Y_k$ over n successive channel uses. We also used the compact notation $W_k^{M-1} \equiv \{W_k, \ldots, W_{M-1}\}$ for the vectors of random variables. $Y_k^n$ represents the n-use quantum system of the kth receiver. By averaging a conditional received state that is indexed by a set of messages $m_k^{M-1}$ over the probability mass function of a subset of the message-sets $W_k^{M-1}$, we get a new conditional received state now indexed only by the remaining (smaller set of) messages. The received state that has been averaged over all messages is not indexed by any message. Also, by taking the trace of a joint conditional received state over a set of receiver Hilbert spaces, we obtain the conditional received state for the remaining (smaller set of) receivers. Thus, the following (and other similar) identities hold:


Pw“-1(mfe+l1)^mf-i 

(3.108) 

M—l 
mk  + 1 

(3.109) 

rriM- 1 

(~Yk-YM- 1\ 

(3.110) 
such that

$$\sum_{m_k^{M-1}} p_{W_k^{M-1}}(m_k^{M-1})\, S\!\left(\hat{\rho}^{Y_{k-1}^n}_{m_k^{M-1}}\right) = n\, g(\eta_{k-1} \beta_k \bar{N}), \tag{3.113}$$

which completes the proof of Lemma 3.6.


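Now, as $(R_0, \ldots, R_{M-1})$ is an achievable rate M-tuple, there exist $\epsilon_{k,n} \to 0$ as $n \to \infty$, for $k \in \{0, \ldots, M-1\}$, such that,

$$\begin{aligned}
0 \le nR_k &= H(W_k) \\
&\le I(W_k; \hat{W}_k) + n\epsilon_{k,n} && (3.114) \\
&\le \chi\!\left(p_{W_k}(m_k),\, \hat{\rho}^{Y_k^n}_{m_k}\right) + n\epsilon_{k,n} && (3.115) \\
&\le \sum_{m_{k+1}^{M-1}} p_{W_{k+1}^{M-1}}(m_{k+1}^{M-1})\, \chi\!\left(p_{W_k}(m_k),\, \hat{\rho}^{Y_k^n}_{m_k^{M-1}}\right) + n\epsilon_{k,n}, && (3.116)
\end{aligned}$$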

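where $I(W_k; \hat{W}_k) = H(W_k) - H(W_k|\hat{W}_k)$ is the Shannon mutual information. Inequality (3.114) follows from Fano's inequality, (3.115) follows from the Holevo bound [27, 28, 29], and (3.116) follows from the convexity of Holevo information in the conditional states (for a fixed prior, averaging the signal states cannot increase the Holevo information), as $\hat{\rho}^{Y_k^n}_{m_k} = \sum_{m_{k+1}^{M-1}} p_{W_{k+1}^{M-1}}(m_{k+1}^{M-1})\, \hat{\rho}^{Y_k^n}_{m_k^{M-1}}$. Specializing inequality (3.116) to $k = 0$ we obtain,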


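$$\begin{aligned}
nR_0 &\le \sum_{m_1^{M-1}} p_{W_1^{M-1}}(m_1^{M-1})\, \chi\!\left(p_{W_0}(m_0),\, \hat{\rho}^{Y_0^n}_{m_0^{M-1}}\right) + n\epsilon_{0,n} && (3.117) \\
&\le \sum_{m_1^{M-1}} p_{W_1^{M-1}}(m_1^{M-1})\, S\!\left(\sum_{m_0} p_{W_0}(m_0)\, \hat{\rho}^{Y_0^n}_{m_0^{M-1}}\right) + n\epsilon_{0,n} && (3.118) \\
&= \sum_{m_1^{M-1}} p_{W_1^{M-1}}(m_1^{M-1})\, S\!\left(\hat{\rho}^{Y_0^n}_{m_1^{M-1}}\right) + n\epsilon_{0,n} && (3.119) \\
&= n\, g(\eta_0 \beta_1 \bar{N}) + n\epsilon_{0,n}, && (3.120)
\end{aligned}$$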


where inequality (3.118) follows from dropping the second term of the Holevo information in (3.117). Equality (3.120) follows from Lemma 3.2, for $k = 1$. For $k \in \{1, \ldots, M-2\}$, continuing from (3.116) we have,
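$$\begin{aligned}
nR_k &\le \sum_{m_{k+1}^{M-1}} p_{W_{k+1}^{M-1}}(m_{k+1}^{M-1}) \left[ S\!\left(\sum_{m_k} p_{W_k}(m_k)\, \hat{\rho}^{Y_k^n}_{m_k^{M-1}}\right) - \sum_{m_k} p_{W_k}(m_k)\, S\!\left(\hat{\rho}^{Y_k^n}_{m_k^{M-1}}\right) \right] + n\epsilon_{k,n} \\
&= \sum_{m_{k+1}^{M-1}} p_{W_{k+1}^{M-1}}(m_{k+1}^{M-1})\, S\!\left(\hat{\rho}^{Y_k^n}_{m_{k+1}^{M-1}}\right) - \sum_{m_k^{M-1}} p_{W_k^{M-1}}(m_k^{M-1})\, S\!\left(\hat{\rho}^{Y_k^n}_{m_k^{M-1}}\right) + n\epsilon_{k,n} && (3.121) \\
&= n\, g(\eta_k \beta_{k+1} \bar{N}) - \sum_{m_k^{M-1}} p_{W_k^{M-1}}(m_k^{M-1})\, S\!\left(\hat{\rho}^{Y_k^n}_{m_k^{M-1}}\right) + n\epsilon_{k,n}, && (3.122)
\end{aligned}$$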


where (3.121) and (3.122) follow from the definition of Holevo information and Lemma 3.2 respectively. Next, we shall bound the second term in (3.122). Let us define $\bar{N}^{A}_{m_k^{M-1},j}$ to be the mean photon number of the jth symbol $\hat{\rho}^{A_j}_{m_k^{M-1}}$ of the n-symbol codeword $\hat{\rho}^{A^n}_{m_k^{M-1}}$, whose mean photon number per symbol is given by $\bar{N}^{A}_{m_k^{M-1}} \equiv \frac{1}{n} \sum_{j=1}^{n} \bar{N}^{A}_{m_k^{M-1},j}$. Hence, $\eta_{k-1} \bar{N}^{A}_{m_k^{M-1},j}$ is the mean photon number of the jth symbol $\hat{\rho}^{Y_{k-1,j}}_{m_k^{M-1}}$ of the n-symbol codeword $\hat{\rho}^{Y_{k-1}^n}_{m_k^{M-1}}$, whose mean photon number per symbol is given by $\eta_{k-1} \bar{N}^{A}_{m_k^{M-1}}$. The overall mean photon number of the transmitter codeword per channel use, $\bar{N}$, is thus given by averaging $\bar{N}^{A}_{m_k^{M-1}}$ over the codebooks $W_k^{M-1}$, i.e.,

$$\bar{N} = 2^{-n \sum_{j=k}^{M-1} R_j} \sum_{m_k^{M-1}} \bar{N}^{A}_{m_k^{M-1}}.$$

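From the non-negativity of von Neumann entropy, the fact that the maximum von Neumann entropy of a single-mode bosonic state with mean photon number $\bar{N}$ is given by $g(\bar{N})$, and the concavity of $g(x)$, we have the following inequalities:

$$0 \le S\!\left(\hat{\rho}^{Y_{k-1}^n}_{m_k^{M-1}}\right) \le \sum_{j=1}^{n} g\!\left(\eta_{k-1} \bar{N}^{A}_{m_k^{M-1},j}\right) \le n\, g\!\left(\eta_{k-1} \bar{N}^{A}_{m_k^{M-1}}\right). \tag{3.123}$$

Therefore, there must exist real numbers $\beta_{m_k^{M-1}} \in [0, 1]$, $\forall m_k^{M-1} \in W_k^{M-1}$, such that

$$S\!\left(\hat{\rho}^{Y_{k-1}^n}_{m_k^{M-1}}\right) = n\, g\!\left(\eta_{k-1}\, \beta_{m_k^{M-1}}\, \bar{N}^{A}_{m_k^{M-1}}\right). \tag{3.124}$$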


Because of the degraded nature of the channel, $\hat{y}_k = \sqrt{\eta_k/\eta_{k-1}}\,\hat{y}_{k-1} + \sqrt{1 - (\eta_k/\eta_{k-1})}\,\hat{f}_k$, with $\hat{f}_k$ in a vacuum state (see Fig. 3-13). Hence, using Eq. (3.124) and strong conjecture 2 (see chapter 4), we have

$$S\!\left(\hat{\rho}^{Y_k^n}_{m_k^{M-1}}\right) \ge n\, g\!\left(\eta_k\, \beta_{m_k^{M-1}}\, \bar{N}^{A}_{m_k^{M-1}}\right). \tag{3.125}$$

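Taking an average of both sides of Eq. (3.124) over the codebooks $W_k^{M-1}$, and using Lemma 3.2, we have

$$\sum_{m_k^{M-1}} \frac{n}{2^{n \sum_{j=k}^{M-1} R_j}}\, g\!\left(\eta_{k-1}\, \beta_{m_k^{M-1}}\, \bar{N}^{A}_{m_k^{M-1}}\right) = n\, g(\eta_{k-1} \beta_k \bar{N}). \tag{3.126}$$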


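Equation (3.126) and a theorem on a property of the $g(\cdot)$ function (see Appendix C) then give us

$$\sum_{m_k^{M-1}} \frac{n}{2^{n \sum_{j=k}^{M-1} R_j}}\, g\!\left(\eta_k\, \beta_{m_k^{M-1}}\, \bar{N}^{A}_{m_k^{M-1}}\right) \ge n\, g(\eta_k \beta_k \bar{N}). \tag{3.127}$$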
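The step from Eq. (3.126) to Eq. (3.127) relies on the Appendix C property of $g(\cdot)$, which is not reproduced in this section. Read off from how it is used here, the property is assumed to say: if $\sum_m p_m\, g(\eta' x_m) = g(\eta' x_0)$ and $\eta \le \eta'$, then $\sum_m p_m\, g(\eta x_m) \ge g(\eta x_0)$. The sketch below is only a numerical spot check of that assumed statement for one hypothetical two-point ensemble, with $g^{-1}$ computed by bisection; it is not a proof.

```python
import math


def g(x):
    """g(x) = (1 + x) ln(1 + x) - x ln x, in nats, with g(0) = 0."""
    if x <= 0.0:
        return 0.0
    return (1.0 + x) * math.log(1.0 + x) - x * math.log(x)


def g_inverse(s, upper=1e6):
    """Invert the monotonically increasing g on [0, upper] by bisection."""
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def compare_after_extra_loss(xs, probs, eta_prev, eta_next):
    """Return (average of g(eta_next * x_m), g(eta_next * x0)), where x0 is
    defined by g(eta_prev * x0) = average of g(eta_prev * x_m)."""
    avg_prev = sum(p * g(eta_prev * x) for p, x in zip(probs, xs))
    x0 = g_inverse(avg_prev) / eta_prev
    lhs = sum(p * g(eta_next * x) for p, x in zip(probs, xs))
    rhs = g(eta_next * x0)
    return lhs, rhs


# Hypothetical ensemble: photon numbers 0 and 2.5 with equal probability.
lhs, rhs = compare_after_extra_loss([0.0, 2.5], [0.5, 0.5], 0.8, 0.4)
print(lhs, rhs)
```

For this example the averaged entropy after the extra loss stays above the entropy of the equivalent single thermal state, which is the direction needed in Eq. (3.127).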


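Taking an average of both sides of Eq. (3.125) over the codebooks $W_k^{M-1}$, and using Eq. (3.127), we get

$$\sum_{m_k^{M-1}} p_{W_k^{M-1}}(m_k^{M-1})\, S\!\left(\hat{\rho}^{Y_k^n}_{m_k^{M-1}}\right) \ge \sum_{m_k^{M-1}} \frac{n}{2^{n \sum_{j=k}^{M-1} R_j}}\, g\!\left(\eta_k\, \beta_{m_k^{M-1}}\, \bar{N}^{A}_{m_k^{M-1}}\right) \ge n\, g(\eta_k \beta_k \bar{N}). \tag{3.128}$$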


Combining Eqs. (3.122) and (3.128), we finally get the desired bound for $R_k$, for $k \in \{1, \ldots, M-2\}$, i.e.,

$$nR_k \le n\, g(\eta_k \beta_{k+1} \bar{N}) - n\, g(\eta_k \beta_k \bar{N}) + n\epsilon_{k,n}. \tag{3.129}$$

Since $nR_k \ge 0$, the monotonicity of $g(\cdot)$ implies that

$$\beta_{k+1} \ge \beta_k, \quad \forall k \in \{1, \ldots, M-2\}. \tag{3.130}$$
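To prove the final piece of the converse proof, i.e., to prove that the proposed rate bound for $R_{M-1}$ holds, we proceed as follows:

$$\begin{aligned}
nR_{M-1} &= H(W_{M-1}) \\
&\le I(W_{M-1}; \hat{W}_{M-1}) + n\epsilon_{M-1,n} && (3.131) \\
&\le \chi\!\left(p_{W_{M-1}}(m_{M-1}),\, \hat{\rho}^{Y_{M-1}^n}_{m_{M-1}}\right) + n\epsilon_{M-1,n} && (3.132) \\
&= S\!\left(\sum_{m_{M-1}} p_{W_{M-1}}(m_{M-1})\, \hat{\rho}^{Y_{M-1}^n}_{m_{M-1}}\right) - \sum_{m_{M-1}} p_{W_{M-1}}(m_{M-1})\, S\!\left(\hat{\rho}^{Y_{M-1}^n}_{m_{M-1}}\right) + n\epsilon_{M-1,n} \\
&= S\!\left(\hat{\rho}^{Y_{M-1}^n}\right) - \sum_{m_{M-1}} p_{W_{M-1}}(m_{M-1})\, S\!\left(\hat{\rho}^{Y_{M-1}^n}_{m_{M-1}}\right) + n\epsilon_{M-1,n} && (3.133) \\
&\le n\, g(\eta_{M-1} \bar{N}) - \sum_{m_{M-1}} p_{W_{M-1}}(m_{M-1})\, S\!\left(\hat{\rho}^{Y_{M-1}^n}_{m_{M-1}}\right) + n\epsilon_{M-1,n} && (3.134) \\
&\le n\, g(\eta_{M-1} \bar{N}) - n\, g(\eta_{M-1} \beta_{M-1} \bar{N}) + n\epsilon_{M-1,n}, && (3.135)
\end{aligned}$$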

where inequality (3.131) follows from Fano's inequality, (3.132) results from the Holevo bound, and (3.134) follows from the fact that the maximum von Neumann entropy of a single-mode bosonic state with mean photon number $\bar{N}$ is given by $g(\bar{N})$. The last inequality (3.135) follows from$^{19}$ Eq. (3.128) with $k = M - 1$. As $\epsilon_{k,n} \to 0$ as $n \to \infty$, going to the limit of large block-length codes, Eqs. (3.120), (3.129), (3.130) and (3.135), along with the definitions $\beta_0 = 0$ and $\beta_M = 1$, show that if $(R_0, \ldots, R_{M-1})$ is an achievable rate M-tuple, then it must satisfy

$$R_k \le g(\eta_k \beta_{k+1} \bar{N}) - g(\eta_k \beta_k \bar{N}), \quad k \in \{0, \ldots, M-1\}, \tag{3.136}$$

for real numbers $\beta_k$ satisfying

$$0 = \beta_0 < \beta_1 < \cdots < \beta_{M-1} < \beta_M = 1, \tag{3.137}$$

which is what we set out to prove.


19 Note that the same method we used to bound the second term in Eq. (3.122) for $k \in \{1, \ldots, M-2\}$ can also be used for $k = M - 1$. All the steps from Eq. (3.122) to Eq. (3.128) follow through in exactly the same way if we substitute $k = M - 1$ everywhere.




3.4.7 Thermal-noise bosonic broadcast channel with M receivers

Consider an extension of the noiseless M-receiver bosonic broadcast channel as depicted in Fig. 3-12, in which each environment mode $\hat{e}_k$, for $k \in \{1, \ldots, M-1\}$, is in a zero-mean thermal state with mean photon number $N$ (see Eq. (3.67)). This channel can also be equivalently represented by a degraded model as depicted in Fig. 3-13, in which each of the modes $\hat{f}_k$, for $k \in \{1, \ldots, M-1\}$, is now in a zero-mean thermal state with mean photon number $N$.

Theorem 3.7 - With a mean photon number constraint of $\bar{N}$ photons per channel use at the transmitter, the ultimate capacity region of the thermal-noise bosonic broadcast channel, with uniform noise coupling of $N$ photons on average in each mode, can be achieved by coherent-state encoding with an isotropic Gaussian prior distribution. Given the truth of strong conjectures 1 and 3, the ultimate capacity region is given by$^{20}$

$$R_k \le g(\eta_k \beta_{k+1} \bar{N} + (1 - \eta_k) N) - g(\eta_k \beta_k \bar{N} + (1 - \eta_k) N), \quad k \in \{0, \ldots, M-1\}, \tag{3.138}$$

for real numbers $\beta_k$ satisfying

$$0 = \beta_0 < \beta_1 < \cdots < \beta_{M-1} < \beta_M = 1. \tag{3.139}$$

Proof  -  The  proof  of  this  theorem  follows  exactly  as  in  the  proof  of  the  ultimate 
capacity  region  of  the  noiseless  bosonic  broadcast  channel  with  M  receivers,  using 
ideas  from  the  capacity-region  proof  for  the  thermal-noise  bosonic  broadcast  channel 
with  two  receivers.  We  omit  the  proof  from  the  thesis  due  to  its  notational  complexity. 
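Although the proof is omitted, the thermal-noise rate bounds themselves are easy to evaluate. The sketch below assumes the standard $g(x) = (1+x)\ln(1+x) - x\ln x$ and uses hypothetical names (`n_bar` for the signal photon number, `n_noise` for the noise photon number) and example values; it checks that the bounds reduce to the noiseless ones when the noise vanishes.

```python
import math


def g(x):
    """g(x) = (1 + x) ln(1 + x) - x ln x, in nats, with g(0) = 0."""
    if x <= 0.0:
        return 0.0
    return (1.0 + x) * math.log(1.0 + x) - x * math.log(x)


def thermal_rate_bounds(eta, beta, n_bar, n_noise):
    """Right-hand sides of the thermal-noise rate bounds for k = 0..M-1.

    eta    : [eta_0, ..., eta_{M-1}]
    beta   : [beta_0, ..., beta_M] with beta_0 = 0 and beta_M = 1
    n_bar  : mean signal photon number per channel use at the transmitter
    n_noise: mean thermal photon number of each environment mode
    """
    bounds = []
    for k, eta_k in enumerate(eta):
        noise = (1.0 - eta_k) * n_noise
        bounds.append(g(eta_k * beta[k + 1] * n_bar + noise)
                      - g(eta_k * beta[k] * n_bar + noise))
    return bounds


# Hypothetical three-receiver example.
eta = [0.6, 0.3, 0.1]
beta = [0.0, 0.2, 0.5, 1.0]
noisy = thermal_rate_bounds(eta, beta, 15.0, 2.0)
noiseless = thermal_rate_bounds(eta, beta, 15.0, 0.0)
print(noisy, noiseless)
```

Because $g$ is concave, adding the same noise term to both arguments can only shrink each difference, so every noisy bound lies at or below its noiseless counterpart.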


20 Note that the expression for this capacity region resembles the expression for the capacity region of the M-receiver classical Gaussian broadcast channel, as given in Eq. (3.22). The only difference between these two capacity-region expressions is that Bergmans's $g_c(\cdot)$ function in the classical Gaussian case is replaced by the $g(\cdot)$ function in the quantum bosonic case.

3.4.8 Comparison of bosonic broadcast and multiple-access channel capacity regions

In classical information theory, Vishwanath et al. [53] established a duality between what is termed the dirty paper achievable region (but recently proved to be the ultimate capacity region [56]) for the classical Multiple-Input-Multiple-Output (MIMO)
Gaussian  broadcast  channel  (BC)  and  the  capacity  region  of  the  MIMO  Gaussian 
multiple-access  channel  (MAC),  which  is  easy  to  compute.  Using  this  duality,  the 
computational  complexity  required  for  obtaining  the  capacity  region  for  the  MIMO 
broadcast  channel  was  greatly  reduced.  The  duality  result  states  that  if  we  were 
to  trace  out  the  capacity  regions  of  the  MIMO  Gaussian  MAC  with  a  certain  fixed 
value  of  the  total  received  power  P  and  channel-gain  values,  and  for  all  the  various 
possible  power-allocations  between  the  users,  the  corners  of  all  those  capacity  regions 
would  trace  out  the  capacity  region  of  the  MIMO  Gaussian  broadcast  channel  with 
transmitter  power  P  and  the  exact  same  channel-gain  values.  Unlike  this  classical 
result,  it  turns  out  that  the  capacity  region  of  the  bosonic  broadcast  channel  using 
coherent-state  inputs  is  not  the  exact  dual  of  the  envelope  of  the  capacity  regions 
of a multiple-access channel (MAC) using coherent-state inputs. In Figure 3-15, for $\eta = 0.8$ and $\bar{N} = 15$, we show that the capacity region of the bosonic broadcast channel lies below the envelope of the multiple-access capacity regions of the dual MAC. The capacity region of the bosonic MAC using coherent-state inputs was first computed by Yen [11]. So, assuming that the optimum modulation, coding, and receivers
are  available,  on  a  fixed  beam  splitter  with  the  same  power  budget,  more  collective 
classical  information  can  be  sent  when  this  beam  splitter  is  used  as  a  multiple-access 
channel,  as  opposed  to  when  it  is  used  as  a  broadcast  channel.  We  believe  that  the 
duality  between  the  classical  MIMO  MAC  and  BC  capacity  regions  arises  solely  due 
to  the  special  structure  of  the  log(-)-function  in  the  capacity  region  expressions  of  the 
classical  Gaussian-noise  channels,  rather  than  for  any  physical  reason.  The  capacity 
expressions  for  the  quantum  bosonic  channels  have  the  g(-)-function  instead  which 
does  not  exhibit  the  same  duality  properties. 


Figure  3-15:  Comparison  of  bosonic  broadcast  and  multiple-access  channel  capacity 
regions for $\eta = 0.8$ and $\bar{N} = 15$. The rates are in the units of bits per channel
use.  The  red  line  is  the  conjectured  ultimate  broadcast  capacity  region,  which  lies 
below  the  green  line  -  the  envelope  of  the  MAC  capacity  regions.  Assuming  that  the 
optimum  modulation,  coding,  and  receivers  are  available,  on  a  fixed  beam  splitter 
with  the  same  power  budget,  more  collective  classical  information  can  be  sent  when 
this  beam  splitter  is  used  as  a  multiple-access  channel,  as  opposed  to  when  it  is  used  as 
a  broadcast  channel.  This  is  unlike  the  case  of  the  classical  MIMO  Gaussian  multiple- 
access  and  broadcast  channels  (BC),  where  a  duality  holds  between  the  MAC  and 
BC  capacity  regions. 
3.5 The Wiretap Channel and Privacy Capacity


The term "wiretap channel" was coined by Wyner [57] to describe a communication system in which Alice wishes to communicate classical information to Bob over a point-to-point discrete memoryless channel that is subjected to a wiretap by an eavesdropper Eve. Alice's goal is to reliably and securely communicate classical data to Bob, in such a way that Eve gets no information whatsoever about the message. Wyner used the conditional entropy rate of Alice's transmitted message, given the signal received by Eve (the equivocation), to measure the secrecy level guaranteed by the system. He gave a single-letter characterization of the rate-equivocation region under the limiting assumption that the signal received by Eve is a degraded version of the one received by Bob. Csiszár and Körner later generalized Wyner's results to the case in which the signal received by Eve is not a degraded version of the one received by Bob [58]. These classical-channel results were later extended by Devetak [59] to encompass classical transmission over a quantum wiretap channel.

3.5.1  Quantum  wiretap  channel 

In earlier sections in this chapter, we have defined a quantum channel $\mathcal{N}_{A-B}$ from Alice to Bob to be a trace-preserving completely positive map that transforms Alice's single-use density operator $\hat{\rho}^A$ to Bob's, $\hat{\rho}^B = \mathcal{N}_{A-B}(\hat{\rho}^A)$. The quantum wiretap channel $\mathcal{N}_{A-BE}$ is a quantum channel from Alice to an intended receiver Bob and an eavesdropper Eve. The quantum channel from Alice to Bob is obtained by tracing out $E$ from the channel map, i.e., $\mathcal{N}_{A-B} = \mathrm{Tr}_E(\mathcal{N}_{A-BE})$, and similarly for $\mathcal{N}_{A-E}$. A quantum wiretap channel is degraded if there exists a degrading channel $\mathcal{N}^{\mathrm{deg}}_{B-E}$ such that $\mathcal{N}_{A-E} = \mathcal{N}^{\mathrm{deg}}_{B-E} \circ \mathcal{N}_{A-B}$.

The wiretap channel describes a physical scenario in which for each successive $n$ uses of $\mathcal{N}_{A-BE}$ Alice communicates a randomly generated classical message $m \in W$ to Bob, where $m$ is a classical index that is uniformly distributed over the set, $W$, of $2^{nR}$ possibilities. To encode and transmit $m$, Alice generates an instantiation $k \in K$ of a discrete random variable, and then prepares n-channel-use states that, after transmission through the channel, result in bipartite conditional density operators $\{\hat{\rho}^{B^n E^n}_{m,k}\}$. A $(2^{nR}, n, \epsilon)$ code for this channel consists of an encoder, $x^n : (W, K) \to \mathcal{A}^n$, and a positive operator-valued measure (POVM) $\{\Lambda^{B^n}_m\}$ on $\mathcal{B}^n$ such that the following conditions are satisfied for every $m \in W$:$^{21}$

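1. Bob's probability of decoding error is at most $\epsilon$, i.e.,

$$\mathrm{Tr}\!\left(\hat{\rho}^{B^n}_{m,k}\, \Lambda^{B^n}_m\right) \ge 1 - \epsilon, \quad \forall k, \text{ and} \tag{3.140}$$

2. For any POVM $\{\Lambda^{E^n}_m\}$ on $\mathcal{E}^n$, no more than $\epsilon$ bits of information is revealed about the secret message $m$. Using $j \equiv (m, k)$, this condition can be expressed, in terms of the Holevo information [27, 28, 29], as follows,

$$\chi\!\left(p_j,\, \mathcal{N}^{\otimes n}_{A-E}(\hat{\rho}^{A^n}_j)\right) \le \epsilon. \tag{3.141}$$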


Because Holevo information may not be additive, the classical privacy capacity $C_p$ of the quantum wiretap channel must be computed by maximizing over successive uses of the channel, i.e., for $n$ being the number of uses of the channel [59],

$$C_p(\mathcal{N}_{A-BE}) = \sup_n \max_{p_T(i)\, p_{A|T}(j|i)} \frac{1}{n} \left[ \chi\!\left(p_T(i),\, \sum_j p_{A|T}(j|i)\, \hat{\rho}^{B^n}_j\right) - \chi\!\left(p_T(i),\, \sum_j p_{A|T}(j|i)\, \hat{\rho}^{E^n}_j\right) \right], \tag{3.142}$$

where the $\{\hat{\rho}^{A^n}_j\}$ are density operators on the Hilbert space $\mathcal{H}^{\otimes n}$ of $n$ successive channel uses. The probabilities $\{p_T(i)\}$ form a distribution over an auxiliary classical alphabet $\mathcal{T}$, of size $|\mathcal{T}|$. The ultimate privacy capacity is computed by maximizing the expression specified in (3.142) over $\{p_T(i)\}$, $\{p_{A|T}(j|i)\}$, $\{\hat{\rho}^{A^n}_j\}$, and $n$. For a degraded wiretap channel, the auxiliary random variable is unnecessary, and Eq. (3.142) reduces to

$$C_p(\mathcal{N}_{A-BE}) = \sup_n \max_{p_A(j)} \frac{1}{n} \left[ \chi\!\left(p_A(j),\, \hat{\rho}^{B^n}_j\right) - \chi\!\left(p_A(j),\, \hat{\rho}^{E^n}_j\right) \right]. \tag{3.143}$$

21 $\mathcal{A}^n$, $\mathcal{B}^n$, and $\mathcal{E}^n$ are the n-channel-use alphabets of Alice, Bob, and Eve, with respective sizes $|\mathcal{A}^n|$, $|\mathcal{B}^n|$, and $|\mathcal{E}^n|$.

3.5.2  Noiseless  bosonic  wiretap  channel 

The noiseless bosonic wiretap channel consists of a collection of spatial and temporal bosonic modes at the transmitter that interact with a minimal-quantum-noise environment and split into two sets of spatio-temporal modes en route to two independent receivers, one being the intended receiver and the other being the eavesdropper. The multi-mode bosonic wiretap channel is given by $\bigotimes_s \mathcal{N}_{A_s - B_s E_s}$, where $\mathcal{N}_{A_s - B_s E_s}$ is the wiretap-channel map for the sth mode, which can be obtained from the Heisenberg evolutions

$$\hat{b}_s = \sqrt{\eta_s}\,\hat{a}_s + \sqrt{1 - \eta_s}\,\hat{f}_s, \tag{3.144}$$

$$\hat{e}_s = \sqrt{1 - \eta_s}\,\hat{a}_s - \sqrt{\eta_s}\,\hat{f}_s, \tag{3.145}$$

where the $\{\hat{a}_s\}$ are Alice's modal annihilation operators, and $\{\hat{b}_s\}$, $\{\hat{e}_s\}$ are the corresponding modal annihilation operators for Bob and Eve, respectively. The modal transmissivities $\{\eta_s\}$ satisfy $0 \le \eta_s \le 1$, and the environment modes $\{\hat{f}_s\}$ are in their vacuum states. We will limit our treatment here to the single-mode bosonic wiretap channel, as the privacy capacity of the multi-mode channel can in principle be obtained by summing up capacities of all spatio-temporal modes and maximizing the sum capacity subject to an overall input-power budget using Lagrange multipliers, cf. [9], where this was done for the multi-mode single-user lossy bosonic channel.
Figure 3-16: Schematic diagram of the single-mode bosonic wiretap channel. The transmitter Alice ($A$) encodes her messages to Bob ($B$) in a classical index $j$, and over $n$ successive uses of the channel, thus preparing a bipartite state $\hat{\rho}^{B^n E^n}_j$ where $E^n$ represents $n$ channel uses of an eavesdropper Eve ($E$).

Theorem 3.8 - Assuming the truth of minimum output entropy conjecture 2 (see chapter 4), the ultimate privacy capacity of the single-mode noiseless bosonic wiretap channel (see Fig. 3-16) with mean input photon-number constraint $\langle \hat{a}^\dagger \hat{a} \rangle \le \bar{N}$ is

$$C_p(\mathcal{N}_{A-BE}) = g(\eta \bar{N}) - g((1 - \eta) \bar{N}) \ \text{nats/use}, \tag{3.146}$$

for $\eta > 1/2$ and $C_p = 0$ for $\eta \le 1/2$. This capacity is additive and achievable with single-channel-use coherent-state encoding with a zero-mean isotropic Gaussian prior distribution $p_A(\alpha) = \exp(-|\alpha|^2/\bar{N})/\pi\bar{N}$.
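As a quick illustration of the privacy-capacity formula in Theorem 3.8, the following sketch evaluates it as a function of the transmissivity. It assumes the standard $g(x) = (1+x)\ln(1+x) - x\ln x$; the function name `privacy_capacity` and the example values $\eta = 0.8$, $\bar{N} = 15$ are illustrative choices.

```python
import math


def g(x):
    """g(x) = (1 + x) ln(1 + x) - x ln x, in nats, with g(0) = 0."""
    if x <= 0.0:
        return 0.0
    return (1.0 + x) * math.log(1.0 + x) - x * math.log(x)


def privacy_capacity(eta, n_bar):
    """Privacy capacity of the single-mode noiseless bosonic wiretap channel in
    nats per use, following the formula of Theorem 3.8 (conditional on the
    minimum output entropy conjecture the theorem assumes)."""
    if eta <= 0.5:
        return 0.0
    return g(eta * n_bar) - g((1.0 - eta) * n_bar)


capacity_nats = privacy_capacity(0.8, 15.0)
capacity_bits = capacity_nats / math.log(2.0)   # convert nats/use to bits/use
print(capacity_nats, capacity_bits)
```

At $\eta = 1$ Eve receives nothing and the expression reduces to $g(\bar{N})$; at $\eta = 1/2$ Bob and Eve receive statistically identical states and the capacity vanishes.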


Proof - Devetak's result for the privacy capacity of the degraded quantum wiretap channel in Eq. (3.143) requires finite-dimensional Hilbert spaces. Nevertheless, we will use this result for the bosonic wiretap channel, which has an infinite-dimensional state space, by extending it to infinite-dimensional state spaces through a limiting argument$^{22}$. Furthermore, it was recently shown that the privacy capacity of a degraded wiretap channel is additive, and equal to the single-letter quantum capacity

22 When $|\mathcal{T}|$ and $|\mathcal{A}|$ are finite and we are using coherent states in Eq. (3.143), there will be a finite number of possible transmitted states, leading to a finite number of possible states received by Bob and Eve. Suppose we limit the auxiliary-input alphabet ($\mathcal{T}$), and hence the input ($\mathcal{A}$) and the output alphabets ($\mathcal{B}$ and $\mathcal{E}$), to truncated coherent states within the finite-dimensional Hilbert space spanned by the Fock states $\{|m\rangle : 0 \le m \le M\}$, where $M \gg \bar{N}$. Applying Devetak's theorem to the Hilbert space spanned by these truncated coherent states then gives us a lower bound on the privacy capacity of the bosonic wiretap channel when the entire, infinite-dimensional Hilbert space is employed. By taking $M$ sufficiently large, while maintaining the cardinality condition for $\mathcal{T}$, the rate-region expressions given by Devetak's theorem will converge to Eq. (3.146).
of the channel from Alice to Bob [60], i.e.,

\[ C_P(\mathcal{N}_{A\text{-}BE}) = C_P^{(1)}(\mathcal{N}_{A\text{-}BE}) = Q^{(1)}(\mathcal{N}_{A\text{-}B}), \tag{3.147} \]

where the superscript (1) denotes single-letter capacity. It is straightforward to show
that if $\eta > 1/2$, the bosonic wiretap channel is a degraded channel, in which Bob’s
is the less-noisy receiver and Eve’s is the more-noisy receiver. The degraded nature
of the bosonic wiretap channel is depicted in Fig. 3-16, where the quantum
states $\hat\rho^{E'}$ of the constructed system $E'$ are identical to the quantum states $\hat\rho^{E}$ for a
given input quantum state $\hat\rho^{A}$. Using Eq. (3.147) for the bosonic wiretap channel, we
have


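\[
\begin{aligned}
C_P(\mathcal{N}_{A\text{-}BE}) &= \max_{\langle \hat a^\dagger \hat a \rangle \le N} \left[ S(\hat\rho^{B}) - S(\hat\rho^{E}) \right] \\
&= \max_{\langle \hat b^\dagger \hat b \rangle \le \eta N} \left[ S(\hat\rho^{B}) - S(\hat\rho^{E'}) \right] \\
&= \max_{0 \le K \le g(\eta N)} \left\{ \max_{\langle \hat b^\dagger \hat b \rangle \le \eta N,\; S(\hat\rho^{B}) = K} \left[ S(\hat\rho^{B}) - S(\hat\rho^{E'}) \right] \right\} \\
&= \max_{0 \le K \le g(\eta N)} \left\{ K - \min_{\langle \hat b^\dagger \hat b \rangle \le \eta N,\; S(\hat\rho^{B}) = K} S(\hat\rho^{E'}) \right\} \\
&= \max_{0 \le K \le g(\eta N)} \left\{ K - g\!\left[ (1-\eta)\, g^{-1}(K)/\eta \right] \right\} \\
&= g(\eta N) - g((1-\eta)N) \ \text{nats/use} \\
&= Q^{(1)}(\mathcal{N}_{A\text{-}B}).
\end{aligned}
\tag{3.148}
\]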


The first equality above follows from Lemma 3 of [60]. The second equality follows
from $\mathcal{N}_{A\text{-}BE}$ being a degraded channel. The restriction to $0 \le K \le g(\eta N)$ in the
third equality is permissible because $\max_{\langle \hat b^\dagger \hat b \rangle \le \eta N} S(\hat\rho^{B}) = g(\eta N)$. The fifth equality
follows (see footnote 23) from minimum output entropy conjecture 2 (see Chapter 4), which also
implies that the optimum $\hat\rho^{B}$ is a thermal state with $\langle \hat b^\dagger \hat b \rangle = \eta N$. Hence, capacity is
attained when Alice encodes using coherent-state inputs $|\alpha\rangle$ with a zero-mean isotropic

23. Here, $g^{-1}(S)$ is the inverse of the function $g(N)$. Because $g(N)$ for $N \ge 0$ is a non-negative,
monotonically increasing, concave function of $N$, it has an inverse, $g^{-1}(S)$ for $S \ge 0$, that is
non-negative, monotonically increasing, and convex.
Gaussian prior distribution $p_A(\alpha) = (1/\pi N) \exp(-|\alpha|^2/N)$. The sixth equality
follows from the monotonicity of the function $g(x) - g(\eta x)$ for $0 \le \eta \le 1$, and equality
to the single-letter quantum capacity follows from Eq. (3.147). Note that the privacy
capacity of this channel is zero when $\eta \le 1/2$. It is straightforward to show that in
the limit of high input photon number $N$,

\[ C_P(\mathcal{N}_{A\text{-}BE}) = Q^{(1)}(\mathcal{N}_{A\text{-}B}) = \max\{0, \ln(\eta) - \ln(1-\eta)\}, \]

a result that Wolf et al. [61] independently derived by a different approach without
use of an unproven output entropy conjecture.
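
As a numerical sanity check of the closed form in Eq. (3.148) and its high-photon-number limit, the following minimal sketch may be useful. It assumes the usual thermal-state entropy function $g(x) = (1+x)\ln(1+x) - x\ln x$ (in nats), which is not restated in this section; the function names `g`, `privacy_capacity` and `high_n_limit` are illustrative only.

```python
import math


def g(x: float) -> float:
    """Thermal-state entropy in nats (assumed definition):
    g(x) = (1 + x) ln(1 + x) - x ln x, with g(0) = 0.
    Rewritten as ln(1 + x) + x ln(1 + 1/x) for numerical stability."""
    if x < 0:
        raise ValueError("mean photon number must be non-negative")
    if x == 0:
        return 0.0
    return math.log1p(x) + x * math.log1p(1.0 / x)


def privacy_capacity(eta: float, n_photons: float) -> float:
    """Conjectured privacy capacity g(eta N) - g((1 - eta) N) in nats/use,
    and zero when eta <= 1/2 (Eq. (3.148))."""
    if eta <= 0.5:
        return 0.0
    return g(eta * n_photons) - g((1.0 - eta) * n_photons)


def high_n_limit(eta: float) -> float:
    """High-photon-number limit max{0, ln(eta) - ln(1 - eta)}."""
    if eta <= 0.5:
        return 0.0
    return math.log(eta) - math.log(1.0 - eta)


if __name__ == "__main__":
    for n in (1.0, 10.0, 1e3, 1e6):
        print(n, privacy_capacity(0.8, n), high_n_limit(0.8))
```

For a fixed $\eta > 1/2$ the printed capacity approaches $\ln(\eta/(1-\eta))$ as $N$ grows, which is the limiting expression quoted above.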
Chapter 4


Minimum  Output  Entropy 
Conjectures  for  Bosonic  Channels 


In general, the evolution of a quantum state resulting from the state’s propagation
through a quantum communication channel is not unitary, so that a pure state loses
some coherence in its transit through that channel. Various measures of a channel’s
ability to preserve the coherence of its input state have been introduced. One of the
most useful of these is the channel’s capacity. In this chapter, we will focus on a
different, but somewhat related measure, namely the minimum von Neumann entropy
$S(\mathcal{E}(\hat\rho))$ at the output of a quantum channel $\mathcal{E}$, optimized over the input state $\hat\rho$. This
quantity is related to the minimum amount of noise implicit in the channel. The output
entropy associated with a pure-state input measures the entanglement that such
a state establishes with the environment during the communication process. Because
the state of the environment is not accessible, this entanglement is responsible for
the loss of quantum coherence, and hence for the injection of noise into the channel
output. Low values of entanglement established with the environment correspond
to low-noise communication channels. Furthermore, the study of $S$ yields important
information about channel capacities. In particular, we have shown that an upper
bound on the classical capacity derives from a lower bound on the output entropy of
multiple channel uses; see, e.g., [55]. Finally, the additivity of the minimum entropy
has been shown to imply the additivity of the classical capacity and of the
entanglement of formation [62, 63], which is a problem of great interest to the quantum
information research community.

Our study of minimum output entropy will be restricted to bosonic channels in
which the optical-frequency electromagnetic field, used as the information carrier,
interacts with a source of additive thermal noise. For these channels, we proposed
a conjecture for the minimum output entropy [10] that, if shown to be true, would
prove the ultimate rate limits to point-to-point bosonic communications, as we
mentioned in Chapter 2. Even though a rigorous proof of the conjecture is yet to be found,
several attempts have been made to prove it, and partial results,
bounds, and other supporting evidence have been found; see, e.g., [10, 55, 9, 39]. We
call this conjecture 1. As we described in the previous chapter, a capacity
analysis of the bosonic broadcast channel with two receivers and no additional
noise led us to an inner bound on the capacity region, which we showed to be the
ultimate capacity region under the presumption of a second minimum output entropy
conjecture [12], conjecture 2. We further saw in Chapter 3 that capacity analysis
of the two-receiver and the general M-receiver bosonic broadcast channel with
additive thermal noise leads to an inner bound on the capacity region achievable using
coherent-state encoding. We proved that this inner bound is the ultimate capacity
region under the presumption of a slightly generalized version of conjecture 2, which
we call conjecture 3. We also showed in Chapter 3 that proving the single-mode
version of conjecture 2 will establish the privacy capacity of the lossy bosonic channel
[13]. In what follows, all these conjectures will be termed ‘weak’ when they are
applied to single-mode states, and they will be termed ‘strong’ when they are applied
to general n-mode bosonic states. The strong version of each conjecture subsumes
the respective weak version as a special case. Neither the weak nor the strong version
of these conjectures has been proven yet, but a variety of supporting evidence has
been obtained, especially for conjecture 1 [10].

We will spend the next two sections of this chapter describing each minimum
output entropy conjecture and its significance, along with the work that has been done
so far in attempting to prove these conjectures and to obtain evidence in support of
their validity. The final section of this chapter discusses proofs of the strong versions of
each minimum output entropy conjecture for Wehrl entropy, which is an alternative measure
of entropy that provides a measurement of a quantum state in phase space. The
Wehrl-entropy proofs elucidate the thought process that led us recently to conjecture
the Entropy Photon-Number Inequality (EPnI) [13], in analogy with the Entropy
Power Inequality (EPI) from classical information theory. The EPnI subsumes all
the minimum output entropy conjectures presented in this chapter, and will be the
subject matter of the next chapter.

4.1  Minimum  Output  Entropy  Conjectures 

4.1.1  Conjecture  1 

Weak Conjecture 1: Let a lossless beam splitter have input $\hat a$ in state $\hat\rho^{A}$, input
$\hat b$ in a zero-mean thermal state with mean photon number $N$, and output $\hat c$ from
its transmissivity-$\eta$ port, i.e., $\hat c = \sqrt{\eta}\,\hat a + \sqrt{1-\eta}\,\hat b$. Then $S(\hat\rho^{C})$, the von Neumann
entropy of output $\hat c$, is minimized when the input $\hat a$ is in the vacuum state
(or any non-zero-mean coherent state), and the minimum output entropy is given by
$S(\hat\rho^{C}) = g((1-\eta)N)$.

Strong Conjecture 1: Consider $n$ uses of a lossless beam splitter in which the
output modes of the $n$ uses, $\hat c_i : 1 \le i \le n$, are related to the input modes by

\[ \hat c_i = \sqrt{\eta}\,\hat a_i + \sqrt{1-\eta}\,\hat b_i, \quad \forall\, 1 \le i \le n. \tag{4.1} \]

Let the input modes $\hat b_i : 1 \le i \le n$ be in a product state of mean-photon-number $N$
thermal states. Then putting all the $\hat a_i : 1 \le i \le n$ in their vacuum states (or
equivalently in coherent states of arbitrary mean values) minimizes the output von Neumann
entropy of the joint state of the $\hat c_i : 1 \le i \le n$. The resulting minimum output entropy
is $S(\hat\rho^{C^n}) = n\,g((1-\eta)N)$.

In [55], we showed that proving strong conjecture 1 would complete the
classical-capacity proof of the point-to-point bosonic channel with additive thermal noise, and
would also prove that the capacity is achieved using a coherent-state encoding and
an optimum detection scheme that employs joint measurements over long codeword
blocks.

4.1.2  Conjecture  2 

Weak Conjecture 2: Let a lossless beam splitter have input $\hat a$ in its vacuum
state, input $\hat b$ in a zero-mean state with von Neumann entropy $S(\hat\rho^{B}) = g(K)$, and
output $\hat c$ from its transmissivity-$\eta$ port. Then the von Neumann entropy of output $\hat c$
is minimized when input $\hat b$ is in a thermal state with average photon number $K$, and
the minimum output entropy is given by $S(\hat\rho^{C}) = g((1-\eta)K)$.

Strong Conjecture 2: Consider $n$ uses of the beam splitter in which the output
modes of the $n$ uses, $\hat c_i : 1 \le i \le n$, are related to the input modes by Eq. (4.1). Let the
input modes $\hat a_i : 1 \le i \le n$ be in a product state of $n$ vacuum states. Also, the von
Neumann entropy of the joint state of the inputs $\hat b_i : 1 \le i \le n$ is constrained to be
$n\,g(K)$. Then, putting all the $\hat b_i : 1 \le i \le n$ in a product state of mean-photon-number
$K$ thermal states minimizes the output von Neumann entropy of the joint state of the
$\hat c_i : 1 \le i \le n$. The resulting minimum output entropy is $S(\hat\rho^{C^n}) = n\,g((1-\eta)K)$.

In Chapter 3, we showed that proving strong conjecture 2 would complete the
converse proof of the capacity region theorem for the general M-receiver noiseless
bosonic broadcast channel. Proving the conjecture would also establish that
a product coherent-state encoder and optimum joint-measurement detectors at each
receiver achieve the ultimate capacity region for the noiseless bosonic broadcast
channel.
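
To make weak conjecture 2 concrete, the sketch below compares a thermal input with one Fock-diagonal alternative of the same entropy, a Poisson photon-number mixture. It relies on two standard facts that are assumptions relative to this section: $g(x) = (1+x)\ln(1+x) - x\ln x$, and the fact that a beam splitter with vacuum in the other port thins a Fock-diagonal photon-number distribution binomially, so a Poisson input of mean $\lambda$ stays Poisson with mean $(1-\eta)\lambda$ and a thermal input of mean $K$ stays thermal with mean $(1-\eta)K$. All function names are illustrative.

```python
import math


def g(x: float) -> float:
    """Thermal-state entropy in nats (assumed definition)."""
    return 0.0 if x == 0 else math.log1p(x) + x * math.log1p(1.0 / x)


def g_inverse(s: float, hi: float = 1e6) -> float:
    """Invert the monotonically increasing g on [0, hi] by bisection."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def poisson_entropy(lam: float, n_max: int = 400) -> float:
    """Shannon entropy (nats) of a Poisson(lam) distribution, truncated at n_max."""
    if lam == 0:
        return 0.0
    total = 0.0
    for n in range(n_max + 1):
        log_p = -lam + n * math.log(lam) - math.lgamma(n + 1)
        total -= math.exp(log_p) * log_p
    return total


def compare_outputs(lam: float, eta: float):
    """Return (thermal output entropy, Poisson output entropy) when input b
    carries the same input entropy in both cases and input a is vacuum."""
    t = 1.0 - eta                      # fraction of input b reaching output c
    k = g_inverse(poisson_entropy(lam))  # thermal photon number with equal entropy
    return g(t * k), poisson_entropy(t * lam)


if __name__ == "__main__":
    print(compare_outputs(2.0, 0.5))
```

In this example the thermal input gives the smaller output entropy, as the conjecture predicts; a single comparison of this kind is only an illustration, not evidence beyond what is reported in Section 4.2.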

4.1.3  Conjecture  3:  An  extension  of  Conjecture  2 

Weak Conjecture 3: Let a lossless beam splitter have input $\hat a$ in a zero-mean
thermal state with mean photon number $N$, input $\hat b$ in a zero-mean state with von
Neumann entropy $S(\hat\rho^{B}) = g(K)$, and output $\hat c$ from its transmissivity-$\eta$ port. Then
the von Neumann entropy of output $\hat c$ is minimized when input $\hat b$ is in a thermal
state with average photon number $K$, and the minimum output entropy is given by
$S(\hat\rho^{C}) = g(\eta N + (1-\eta)K)$.

Strong Conjecture 3: Consider $n$ uses of the beam splitter in which the output
modes of the $n$ uses, $\hat c_i : 1 \le i \le n$, are related to the input modes by Eq. (4.1).
Let the input modes $\hat a_i : 1 \le i \le n$ be in a product state of $n$ mean-photon-number
$N$ thermal states. Also, the von Neumann entropy of the joint state of the inputs
$\hat b_i : 1 \le i \le n$ is constrained to be $n\,g(K)$. Then, putting all the $\hat b_i : 1 \le i \le n$ in a
product state of mean-photon-number $K$ thermal states minimizes the output von
Neumann entropy of the joint state of the $\hat c_i : 1 \le i \le n$. The resulting minimum
output entropy is $S(\hat\rho^{C^n}) = n\,g(\eta N + (1-\eta)K)$.

In Chapter 3, we showed that proving strong conjecture 3 would complete the
converse proof of the capacity region theorem for the general M-receiver bosonic
broadcast channel with additive thermal noise. Proving the conjecture would also establish
that a product coherent-state encoder and optimum joint-measurement
detectors at each receiver achieve the ultimate capacity region for the thermal-noise
bosonic broadcast channel.


4.2  Evidence  in  Support  of  the  Conjectures 

In this section, we list all the supporting evidence that has been collected so far
in favor of the above minimum output entropy conjectures. Most of the supporting
evidence we have is for conjecture 1, although there is some for the others.


1. Proofs for entropy measures other than von Neumann entropy: It
turns out to be easier to work analytically with certain entropy measures that
are alternatives to the von Neumann entropy, e.g., the quantum-state Wehrl
entropy, Rényi entropy, and the Rényi-Wehrl entropy. Proofs of the statements
in conjectures 1, 2 and 3 have been attempted for these alternative
measures of entropy. The following results were obtained.
(i) Wehrl entropy is the Shannon differential entropy (with an offset of $\ln \pi$)
of the Husimi probability function $Q_{\hat\rho}(\mu)$ for the state $\hat\rho$ [64],

\begin{align}
W(\hat\rho) &= -\int Q_{\hat\rho}(\mu) \ln\left[\pi Q_{\hat\rho}(\mu)\right] d^2\mu \tag{4.2} \\
&= h(Q_{\hat\rho}(\mu)) - \ln \pi, \tag{4.3}
\end{align}

where $Q_{\hat\rho}(\mu) = \langle\mu|\hat\rho|\mu\rangle/\pi$ with $|\mu\rangle$ a coherent state. The Wehrl entropy
provides a measurement of the state $\hat\rho$ in phase space and its minimum
value is achieved for coherent states [64]. Conjecture 1 (both the strong
and weak forms) was proved for the Wehrl entropy measure by Giovannetti
et al. [34]. We have proven weak conjectures 2 and 3 for Wehrl
entropy using a technique similar to that used in the Wehrl-entropy
proof of conjecture 1 (see Appendix D). Later, we proved both the strong
and the weak forms of conjectures 1, 2 and 3 for Wehrl entropy by using the
Entropy Power Inequality (EPI) of classical information theory.

(ii) Rényi entropy of order $z$, $S_z(\hat\rho)$, of a quantum state $\hat\rho$ is defined in a way
analogous to the Rényi entropy of order $z$ for a classical
random variable $X$ with probability mass function $\{p_i\}$, i.e.,
$H_z(X) = -(1/(z-1)) \ln\left(\sum_i p_i^z\right)$:

\[ S_z(\hat\rho) = -\frac{1}{z-1} \ln \mathrm{Tr}(\hat\rho^z), \quad \text{for } 0 < z < \infty,\ z \ne 1. \tag{4.4} \]

It is a monotonic function of the $z$-purity of a density operator, and reduces
to the von Neumann entropy in the limit $z \to 1$. Weak
and strong versions of conjecture 1 have been proven for integer-ordered
Rényi entropies, i.e., for $z \in \{2, 3, \ldots\}$ [34].

(iii) Rényi-Wehrl entropy of order $z$ is defined by

\[ W_z(\hat\rho) = -\frac{1}{z-1} \ln\left[ \int \frac{d^2\mu}{\pi} \left(\pi Q_{\hat\rho}(\mu)\right)^z \right], \quad \text{for } z > 1. \tag{4.5} \]

Thus the Wehrl entropy is the limit of $W_z(\hat\rho)$ as $z \to 1$. Weak conjecture
1 has been proved for the Rényi-Wehrl entropy measure [34].
2. Proof for Gaussian states: Strong conjectures 1 and 2 have been proven
for the special case in which the input states are restricted to be Gaussian,
and we have shown them to be equivalent to each other under the
Gaussian-input-state restriction [12]. The proofs result from the fact that Gaussian states
are completely characterized by their means and covariance matrices, and if the
two inputs to a beam splitter are independent Gaussian states, then the outputs
of the beam splitter are in a jointly Gaussian state whose means and covariance
matrix are linear functions of the means and covariance matrices of the input
Gaussian states. The Gaussian-state proof for conjecture 1 appeared in [10].
Weak conjecture 3 can be proved for Gaussian-state inputs, but the strong
form of conjecture 3 has not yet been proved under the Gaussian-input-state
restriction.

3. Majorization conjecture and simulated annealing: In [10], we proposed
the majorization conjecture (which is stronger than weak conjecture 1), whose
truth would imply the truth of weak conjecture 1: The output states produced
by coherent-state inputs majorize all other output states. By definition, a state
$\hat\rho$ majorizes a state $\hat\sigma$ (which we denote by $\hat\rho \succ \hat\sigma$) if all ordered partial sums
of the eigenvalues of $\hat\rho$ equal or exceed the corresponding sums for $\hat\sigma$, i.e.,

\[ \hat\rho \succ \hat\sigma \iff \sum_{i=0}^{k} \lambda_i \ge \sum_{i=0}^{k} \mu_i, \quad \forall k \ge 0, \tag{4.6} \]

where $\lambda_i$ and $\mu_i$ are the eigenvalues of $\hat\rho$ and $\hat\sigma$, respectively, arranged in
decreasing order (i.e., $\lambda_0 \ge \lambda_1 \ge \ldots$). If $\hat\rho \succ \hat\sigma$, then $S(\hat\rho) \le S(\hat\sigma)$. Thus, if
the majorization conjecture holds, it would imply weak conjecture 1. As a test
of this conjecture, we used simulated annealing, a well-known algorithm for
searching for the global minimum of multivariate functions, to minimize the
output entropy of the lossy thermal-noise channel. We used a variety of randomly
generated input states to initiate the minimization, and in each case the final
input state after a few hundred iterations of the algorithm was extremely close
to a coherent state, as proposed by conjecture 1. In fact, we found for all the
cases we studied that not only did the output state at every successive
iteration of the algorithm have a lower entropy than the output state of the previous
iteration, but the eigenvalues of the output state at every iteration also majorized
those of the preceding iteration.

4. Lower and upper bounds: A suite of lower and upper bounds was found
for the output entropy of the lossy thermal-noise channel that support weak
conjecture 1. The details and plots appeared in [10].

5. Local minimum condition: In support of strong conjecture 1, it was
also shown in [10] that the product n-mode vacuum state is a local minimum
of output entropy for n uses of the lossy thermal-noise channel.

6. Thermal state best of all Fock-state diagonal states: A weaker version
of conjecture 2 would be to propose that the thermal-state input yields the
lowest output entropy among all input states (with the same entropy, as
required by conjecture 2) that are diagonal in the number-state (Fock-state)
basis. We verified that this is indeed the case for several input states diagonal
in the number-state basis (see Fig. 4-1).
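
The majorization test described in item 3 above can be sketched as follows. This is only an illustrative check of the definition in Eq. (4.6) on finite eigenvalue lists, not the simulated-annealing code used in [10]; the function names are hypothetical.

```python
import math


def majorizes(lam, mu, tol=1e-12):
    """Return True if the spectrum lam majorizes the spectrum mu, i.e. every
    partial sum of the decreasingly ordered lam is at least the corresponding
    partial sum of the decreasingly ordered mu (Eq. (4.6))."""
    a = sorted(lam, reverse=True)
    b = sorted(mu, reverse=True)
    n = max(len(a), len(b))
    a += [0.0] * (n - len(a))  # pad the shorter spectrum with zeros
    b += [0.0] * (n - len(b))
    sum_a = sum_b = 0.0
    for x, y in zip(a, b):
        sum_a += x
        sum_b += y
        if sum_a < sum_b - tol:
            return False
    return True


def entropy(spectrum):
    """Entropy -sum p ln p (nats) of an eigenvalue list; zero terms are skipped."""
    return -sum(p * math.log(p) for p in spectrum if p > 0)


if __name__ == "__main__":
    rho = [0.7, 0.2, 0.1]
    sigma = [0.4, 0.3, 0.3]
    print(majorizes(rho, sigma), entropy(rho) <= entropy(sigma))
```

In this toy example the first spectrum majorizes the second and also has the lower entropy, in line with the implication $\hat\rho \succ \hat\sigma \Rightarrow S(\hat\rho) \le S(\hat\sigma)$ stated in item 3.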

4.3 Proof of all Strong Conjectures for Wehrl Entropy

Inasmuch  as  we  were  unable  to  prove  the  strong  conjectures  for  von  Neumann  entropy, 
once  we  had  the  Wehrl-entropy  proofs  of  weak  conjectures  2  and  3  (see  Appendix  D) 
and  the  Wehrl-entropy  proof  of  the  strong  conjecture  1  [65],  we  wanted  to  generalize 
the  Wehrl-entropy  proofs  of  conjectures  2  and  3  to  their  respective  strong  forms  as 
well.  We  found  that  the  proofs  of  all  the  strong  Wehrl-entropy  conjectures  followed 
Figure 4-1: This figure presents empirical evidence in support of weak conjecture
2. The input $\hat\rho^{A} = |0\rangle\langle 0|$ is in its vacuum state. For a fixed value of $S(\hat\rho^{B})$,
we choose three different inputs $\hat\rho^{B}$, each one diagonal in the Fock-state basis, i.e.,
$\hat\rho^{B} = \sum_{n=0}^{\infty} p_n |n\rangle\langle n|$ with $\sum_{n=0}^{\infty} p_n = 1$. The three different inputs $\hat\rho^{B}$ correspond to
choosing the distribution $\{p_n\}$ to be a Binomial distribution (blue curve), a Poisson
distribution (red curve) and a Bose-Einstein distribution (green curve). As expected,
we see that the output state $\hat\rho^{C}$ has the lowest entropy when $\hat\rho^{B}$ is a thermal state,
i.e., when $\{p_n\}$ is a Bose-Einstein distribution.


from a simple observation that Wehrl entropy is the Shannon entropy of the Husimi
function (with a fixed offset term), and that the Entropy Power Inequality (EPI) [66]
for Shannon entropy encompasses the Wehrl entropy conjectures as special cases.

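The Wehrl entropy is defined for an $n$-mode density operator $\hat\rho$ in a way analogous
to that for a single-mode state (4.2),

\begin{align}
W(\hat\rho) &\triangleq -\int Q_{\hat\rho}(\boldsymbol\mu) \ln\left(\pi^n Q_{\hat\rho}(\boldsymbol\mu)\right) d^{2n}\boldsymbol\mu \tag{4.7} \\
&= h(Q_{\hat\rho}(\boldsymbol\mu)) - n \ln \pi, \tag{4.8}
\end{align}

where the Husimi function $Q_{\hat\rho}(\boldsymbol\mu) = \langle\boldsymbol\mu|\hat\rho|\boldsymbol\mu\rangle/\pi^n$ is a $2n$-dimensional probability
density function, with $|\boldsymbol\mu\rangle = |\mu_1\rangle \otimes |\mu_2\rangle \otimes \ldots \otimes |\mu_n\rangle$ being an $n$-mode coherent state,
$\boldsymbol\mu \in \mathbb{C}^n$. Before we embark on the proofs, let us first state the strong versions of the
minimum output entropy conjectures for Wehrl entropy.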

Strong Conjecture 1 (Wehrl): Consider $n$ uses of the beam splitter in which
the output modes of the $n$ uses, $\hat c_i : 1 \le i \le n$, are related to the input modes by
Eq. (4.1). Let the input modes $\hat b_i : 1 \le i \le n$ be in a product state of $n$
mean-photon-number $K$ thermal states. Then, putting all the modes $\hat a_i : 1 \le i \le n$ in a product
of $n$ vacuum states minimizes the output Wehrl entropy of the joint state of the
modes $\hat c_i : 1 \le i \le n$, and the minimum output entropy is given by
$W(\hat\rho^{C^n}) = n(1 + \ln(1 + (1-\eta)K))$.

Strong Conjecture 2 (Wehrl): Consider $n$ uses of the beam splitter in which the
output modes of the $n$ uses, $\hat c_i : 1 \le i \le n$, are related to the input modes by Eq. (4.1).
Let the input modes $\hat a_i : 1 \le i \le n$ be in a product state of $n$ vacuum states. Also,
the Wehrl entropy of the joint state of the inputs $\hat b_i : 1 \le i \le n$ is constrained to be
$W(\hat\rho^{B^n}) = n(1 + \ln(1 + K))$. Then, putting all the modes $\hat b_i : 1 \le i \le n$ in a product state
of mean-photon-number $K$ thermal states minimizes the output Wehrl entropy of the
joint state of the modes $\hat c_i : 1 \le i \le n$, and the minimum output entropy is given by
$W(\hat\rho^{C^n}) = n(1 + \ln(1 + (1-\eta)K))$.

Strong Conjecture 3 (Wehrl): Consider $n$ uses of the beam splitter in which the
output modes of the $n$ uses, $\hat c_i : 1 \le i \le n$, are related to the input modes by Eq. (4.1).
Let the input modes $\hat a_i : 1 \le i \le n$ be in a product state of $n$ mean-photon-number $N$
thermal states. Also, the Wehrl entropy of the joint state of the inputs $\hat b_i : 1 \le i \le n$ is
constrained to be $W(\hat\rho^{B^n}) = n(1 + \ln(1 + K))$. Then, putting all the modes $\hat b_i : 1 \le i \le n$
in a product state of mean-photon-number $K$ thermal states minimizes the output
Wehrl entropy of the joint state of the modes $\hat c_i : 1 \le i \le n$, and the minimum output
entropy is given by $W(\hat\rho^{C^n}) = n(1 + \ln(1 + \eta N + (1-\eta)K))$.

Theorem 4.1 (Entropy Power Inequality (EPI)) [66]: Let $X$ and $Y$ be
independent random $m$-vectors taking values in $\mathbb{R}^m$, and let $Z = \sqrt{\eta}\,X + \sqrt{1-\eta}\,Y$.
Then,

\[ e^{2h(Z)/m} \ge \eta\, e^{2h(X)/m} + (1-\eta)\, e^{2h(Y)/m}, \tag{4.9} \]

where $h(X) = -\int p_X(x) \ln[p_X(x)]\, d^m x$ is the Shannon differential entropy of $X$.
Equality in (4.9) holds if and only if $X$ and $Y$ are both Gaussian random vectors
with proportional covariance matrices.
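
A quick way to get a feel for Theorem 4.1 is to evaluate both sides of (4.9) for Gaussian vectors with independent components, where the differential entropy has the closed form $h = \tfrac{1}{2}\sum_i \ln(2\pi e\, v_i)$ for component variances $v_i$. The sketch below is restricted to that diagonal-covariance case, which is an assumption made for simplicity; the function names are illustrative.

```python
import math


def gaussian_entropy(variances):
    """Differential entropy (nats) of a Gaussian vector with independent
    components of the given variances."""
    return 0.5 * sum(math.log(2.0 * math.pi * math.e * v) for v in variances)


def epi_sides(var_x, var_y, eta):
    """Return (lhs, rhs) of Eq. (4.9) for Z = sqrt(eta) X + sqrt(1 - eta) Y,
    with X and Y independent Gaussians with diagonal covariances."""
    m = len(var_x)
    # Variances add with weights eta and 1 - eta under the scaled sum.
    var_z = [eta * a + (1.0 - eta) * b for a, b in zip(var_x, var_y)]
    lhs = math.exp(2.0 * gaussian_entropy(var_z) / m)
    rhs = (eta * math.exp(2.0 * gaussian_entropy(var_x) / m)
           + (1.0 - eta) * math.exp(2.0 * gaussian_entropy(var_y) / m))
    return lhs, rhs


if __name__ == "__main__":
    print(epi_sides([1.0, 1.0], [3.0, 3.0], 0.25))  # proportional covariances
    print(epi_sides([1.0, 4.0], [4.0, 1.0], 0.5))   # non-proportional covariances
```

With proportional covariances the two sides agree up to rounding, and with non-proportional covariances the left side is strictly larger, matching the equality condition in the theorem.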

Corollary 4.2 [Shapiro, 2007]: Consider $n$ uses of the beam splitter in which the
output modes of the $n$ uses, $\hat{\boldsymbol c} = \{\hat c_i : 1 \le i \le n\}$, are related to the input modes
$\hat{\boldsymbol a} = \{\hat a_i : 1 \le i \le n\}$ and $\hat{\boldsymbol b} = \{\hat b_i : 1 \le i \le n\}$ by Eq. (4.1). Let $\hat\rho^{A^n}$, $\hat\rho^{B^n}$ and $\hat\rho^{C^n}$ be
the joint density operators of the $n$ uses of the inputs and the output respectively.
Then,

\[ e^{W(\hat\rho^{C^n})/n} \ge \eta\, e^{W(\hat\rho^{A^n})/n} + (1-\eta)\, e^{W(\hat\rho^{B^n})/n}, \tag{4.10} \]

where $W(\hat\rho)$ is the Wehrl entropy of the $n$-mode state $\hat\rho$.

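Proof: Let us first recall a few definitions. The antinormally ordered characteristic
function $\chi_A^{\hat\rho}(\boldsymbol\zeta)$ of an $n$-mode density operator $\hat\rho$ is given by

\[ \chi_A^{\hat\rho}(\boldsymbol\zeta) = \mathrm{tr}\left( \hat\rho\, e^{-\boldsymbol\zeta^\dagger \hat{\boldsymbol a}}\, e^{\hat{\boldsymbol a}^\dagger \boldsymbol\zeta} \right), \tag{4.11} \]

where $\boldsymbol\zeta = (\zeta_1, \ldots, \zeta_n)^T$ is a column vector of $n$ complex numbers. Also, the
antinormally ordered characteristic function $\chi_A^{\hat\rho}(\boldsymbol\zeta)$ and the Husimi function
$Q_{\hat\rho}(\boldsymbol\mu) = \langle\boldsymbol\mu|\hat\rho|\boldsymbol\mu\rangle/\pi^n$ of a state $\hat\rho$ form a $2n$-dimensional Fourier-transform/inverse-transform pair:

\begin{align}
\chi_A^{\hat\rho}(\boldsymbol\zeta) &= \int Q_{\hat\rho}(\boldsymbol\mu)\, e^{\boldsymbol\mu^\dagger \boldsymbol\zeta - \boldsymbol\zeta^\dagger \boldsymbol\mu}\, d^{2n}\boldsymbol\mu, \tag{4.12} \\
Q_{\hat\rho}(\boldsymbol\mu) &= \frac{1}{\pi^{2n}} \int \chi_A^{\hat\rho}(\boldsymbol\zeta)\, e^{\boldsymbol\zeta^\dagger \boldsymbol\mu - \boldsymbol\mu^\dagger \boldsymbol\zeta}\, d^{2n}\boldsymbol\zeta, \tag{4.13}
\end{align}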

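with $\boldsymbol\mu, \boldsymbol\zeta \in \mathbb{C}^n$. As the two $n$-use input states $\hat\rho^{A^n}$ and $\hat\rho^{B^n}$ are statistically
independent, Eq. (4.11) implies that the output-state characteristic function is a product of
the input-state characteristic functions with scaled arguments:

\[ \chi_A^{\hat\rho^{C^n}}(\boldsymbol\zeta) = \chi_A^{\hat\rho^{A^n}}(\sqrt{\eta}\,\boldsymbol\zeta)\; \chi_A^{\hat\rho^{B^n}}(\sqrt{1-\eta}\,\boldsymbol\zeta). \tag{4.14} \]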

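From Eq. (4.14), using the multiplication-convolution property of Fourier transforms
(FT), we get

\[ Q_{\hat\rho^{C^n}}(\boldsymbol\mu) = \frac{1}{\eta^n}\, Q_{\hat\rho^{A^n}}\!\left(\frac{\boldsymbol\mu}{\sqrt{\eta}}\right) * \frac{1}{(1-\eta)^n}\, Q_{\hat\rho^{B^n}}\!\left(\frac{\boldsymbol\mu}{\sqrt{1-\eta}}\right), \tag{4.15} \]

where $*$ denotes convolution and we used the scaling property of the FT:
$\chi_A^{\hat\rho}(\sqrt{\eta}\,\boldsymbol\zeta) \longleftrightarrow (1/\eta^n)\, Q_{\hat\rho}(\boldsymbol\mu/\sqrt{\eta})$.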
Now, as the Husimi function $Q_{\hat\rho}(\cdot)$ is a proper probability density function, we can
define two $2n$-dimensional statistically independent real random vectors $X$ and $Y$,
with distributions $p_X(\boldsymbol\mu) = Q_{\hat\rho^{A^n}}(\boldsymbol\mu)$ and $p_Y(\boldsymbol\mu) = Q_{\hat\rho^{B^n}}(\boldsymbol\mu)$, and define the linear
combination $Z = \sqrt{\eta}\,X + \sqrt{1-\eta}\,Y$. Thus, the p.d.f. of $Z$ is given by
$p_Z(\boldsymbol\mu) = Q_{\hat\rho^{C^n}}(\boldsymbol\mu)$, as found from Eq. (4.15). Using Eq. (4.8), we have that the differential
entropies of $X$, $Y$, and $Z$ can be expressed in terms of the Wehrl entropies of the
$n$-mode quantum systems $A^n$, $B^n$ and $C^n$ respectively, by $h(X) = W(\hat\rho^{A^n}) + n \ln \pi$,
$h(Y) = W(\hat\rho^{B^n}) + n \ln \pi$, and $h(Z) = W(\hat\rho^{C^n}) + n \ln \pi$. Substituting these relations into
(4.9) with $m = 2n$, every term acquires the same factor $\pi$, which cancels and leaves (4.10).
Corollary 4.2 is therefore equivalent to the Entropy Power Inequality (Theorem 4.1) with
$m = 2n$.

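Proof of Strong Conjecture 1 (Wehrl): The input $\hat{\boldsymbol a}$ is given to be in a pure
state. The Wehrl entropy of the input $\hat{\boldsymbol a}$ therefore satisfies [67]

\[ W(\hat\rho^{A^n}) \ge n, \tag{4.16} \]

with equality for coherent states, including the vacuum.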

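The input $\hat{\boldsymbol b}$ is in a product of $K$-photon thermal states. Therefore,

\begin{align}
\hat\rho^{B^n} &= \left( \frac{1}{\pi K} \int e^{-|\alpha|^2/K}\, |\alpha\rangle\langle\alpha|\, d^2\alpha \right)^{\otimes n}, \tag{4.17} \\
Q_{\hat\rho^{B^n}}(\boldsymbol\mu) &= \frac{1}{(\pi(1+K))^n}\, e^{-|\boldsymbol\mu|^2/(1+K)}, \text{ and} \notag \\
W(\hat\rho^{B^n}) &= n(1 + \ln(1+K)). \tag{4.18}
\end{align}

Therefore, Corollary 4.2 implies the following bound:

\[ e^{W(\hat\rho^{C^n})/n} \ge \eta\, e + (1-\eta)\, e^{1+\ln(1+K)}, \tag{4.19} \]

which, on taking the natural logarithm of both sides, translates into a lower bound for
the Wehrl entropy of the output $\hat{\boldsymbol c}$,

\begin{align}
W(\hat\rho^{C^n}) &\ge n \ln\left( e\left( \eta + (1-\eta)\, e^{\ln(1+K)} \right) \right) \tag{4.20} \\
&= n(1 + \ln(1 + (1-\eta)K)). \tag{4.21}
\end{align}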
It is readily verified that a product of $n$ vacuum states at the input $\hat{\boldsymbol a}$, i.e.,
$\hat\rho^{A^n} = (|0\rangle\langle 0|)^{\otimes n}$, achieves the lower bound (4.21), for in this case
$Q_{\hat\rho^{A^n}}(\boldsymbol\mu) = (1/\pi^n)\, e^{-|\boldsymbol\mu|^2}$
and the convolution (4.15) yields
$Q_{\hat\rho^{C^n}}(\boldsymbol\mu) = (1/(\pi(1+(1-\eta)K))^n)\, e^{-|\boldsymbol\mu|^2/(1+(1-\eta)K)}$,
which gives $W(\hat\rho^{C^n}) = n(1 + \ln(1 + (1-\eta)K))$. Hence, a product vacuum state for
the input $\hat{\boldsymbol a}$ achieves minimum output entropy $W(\hat\rho^{C^n})$, and the minimum output
entropy is given by

\[ W(\hat\rho^{C^n}) = n(1 + \ln(1 + (1-\eta)K)). \tag{4.22} \]

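Proof of Strong Conjecture 2 (Wehrl): The input $\hat{\boldsymbol a}$ is given to be in an
$n$-mode vacuum state. Thus the Husimi function and the Wehrl entropy of the input $\hat{\boldsymbol a}$
are given by

\begin{align}
Q_{\hat\rho^{A^n}}(\boldsymbol\mu) &= \frac{1}{\pi^n}\, e^{-|\boldsymbol\mu|^2}, \tag{4.23} \\
W(\hat\rho^{A^n}) &= n. \tag{4.24}
\end{align}

The state of the input $\hat{\boldsymbol b}$ is mixed, with fixed Wehrl entropy $W(\hat\rho^{B^n}) = n(1 + \ln(1+K))$.
Therefore, Corollary 4.2 implies the following bound:

\[ e^{W(\hat\rho^{C^n})/n} \ge \eta\, e + (1-\eta)\, e^{1+\ln(1+K)}, \tag{4.25} \]

which, on taking the natural logarithm of both sides, translates into a lower bound for
the Wehrl entropy of the output $\hat{\boldsymbol c}$,

\begin{align}
W(\hat\rho^{C^n}) &\ge n \ln\left( e\left( \eta + (1-\eta)\, e^{\ln(1+K)} \right) \right) \tag{4.26} \\
&= n(1 + \ln(1 + (1-\eta)K)). \tag{4.27}
\end{align}

It is readily verified that a product of $n$ $K$-photon thermal states at the input
$\hat{\boldsymbol b}$, i.e., $\hat\rho^{B^n} = \left( (1/\pi K) \int e^{-|\alpha|^2/K}\, |\alpha\rangle\langle\alpha|\, d^2\alpha \right)^{\otimes n}$, achieves the lower bound (4.27),
for in this case $Q_{\hat\rho^{B^n}}(\boldsymbol\mu) = (1/(\pi(1+K))^n)\, e^{-|\boldsymbol\mu|^2/(1+K)}$, and the convolution (4.15)
yields $Q_{\hat\rho^{C^n}}(\boldsymbol\mu) = (1/(\pi(1+(1-\eta)K))^n)\, e^{-|\boldsymbol\mu|^2/(1+(1-\eta)K)}$, which gives
$W(\hat\rho^{C^n}) = n(1 + \ln(1 + (1-\eta)K))$. Hence, a product thermal state for the input $\hat{\boldsymbol b}$ achieves
minimum output entropy $W(\hat\rho^{C^n})$, and the minimum output entropy is given by

\[ W(\hat\rho^{C^n}) = n(1 + \ln(1 + (1-\eta)K)). \tag{4.28} \]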

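Proof of Strong Conjecture 3 (Wehrl): The input $\hat{\boldsymbol a}$ is given to be in an
$n$-mode product thermal state with $N$ photons on average in each mode. Thus the
Husimi function and the Wehrl entropy of the input $\hat{\boldsymbol a}$ are given by

\begin{align}
Q_{\hat\rho^{A^n}}(\boldsymbol\mu) &= \frac{1}{(\pi(1+N))^n}\, e^{-|\boldsymbol\mu|^2/(1+N)}, \text{ and} \tag{4.29} \\
W(\hat\rho^{A^n}) &= n(1 + \ln(1+N)). \tag{4.30}
\end{align}

The state of the input $\hat{\boldsymbol b}$ is mixed, with fixed Wehrl entropy $W(\hat\rho^{B^n}) = n(1 + \ln(1+K))$.
Therefore, Corollary 4.2 implies the following bound:

\[ e^{W(\hat\rho^{C^n})/n} \ge \eta\, e^{1+\ln(1+N)} + (1-\eta)\, e^{1+\ln(1+K)}, \tag{4.31} \]

which, on taking the natural logarithm of both sides, translates into a lower bound for
the Wehrl entropy of the output $\hat{\boldsymbol c}$,

\begin{align}
W(\hat\rho^{C^n}) &\ge n \ln\left( e\left( \eta(1+N) + (1-\eta)(1+K) \right) \right) \tag{4.32} \\
&= n(1 + \ln(1 + \eta N + (1-\eta)K)). \tag{4.33}
\end{align}

It is readily verified that a product of $n$ $K$-photon thermal states at the input $\hat{\boldsymbol b}$,
i.e., $\hat\rho^{B^n} = \left( (1/\pi K) \int e^{-|\alpha|^2/K}\, |\alpha\rangle\langle\alpha|\, d^2\alpha \right)^{\otimes n}$, achieves the lower bound (4.33), for in
this case $Q_{\hat\rho^{B^n}}(\boldsymbol\mu) = (1/(\pi(1+K))^n)\, e^{-|\boldsymbol\mu|^2/(1+K)}$, and the convolution (4.15) yields
$Q_{\hat\rho^{C^n}}(\boldsymbol\mu) = (1/(\pi(1+\eta N+(1-\eta)K))^n)\, e^{-|\boldsymbol\mu|^2/(1+\eta N+(1-\eta)K)}$, which gives
$W(\hat\rho^{C^n}) = n(1 + \ln(1 + \eta N + (1-\eta)K))$. Hence, a product thermal state for the input $\hat{\boldsymbol b}$ achieves
minimum output entropy $W(\hat\rho^{C^n})$, and the minimum output entropy is given by

\[ W(\hat\rho^{C^n}) = n(1 + \ln(1 + \eta N + (1-\eta)K)). \tag{4.34} \]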
Chapter 5

The Entropy Photon-Number Inequality and its Consequences


In the previous chapter we saw that the Entropy Power Inequality (EPI) can be used to prove all the Wehrl-entropy versions of the minimum output entropy conjectures as special cases. The reason Wehrl entropies of the input and output states of a beam splitter admit an EPI-like inequality (Corollary 4.2) is that Wehrl entropy is essentially the Shannon entropy of the Husimi function, and the Husimi function of the output state of a beam splitter is the convolution (with properly scaled arguments) of the Husimi functions of the two input states, much like how the probability density function (p.d.f.) of the weighted sum of two independent random variables is the convolution (with properly scaled arguments) of the p.d.f.'s of the two individual random variables. In order to prove the minimum output entropy conjectures for the von Neumann entropy measure, therefore, it is natural to conjecture an EPI-like inequality similar to that in Corollary 4.2, which would supersede all the minimum output entropy conjectures.

In Section 5.1 below, we restate the EPI in three equivalent forms, in terms of the “entropy powers” of the random variables. In Section 5.2 we first restate Corollary 4.2 in terms of what we define as “Wehrl-entropy photon numbers” of the quantum states, in analogy to the notion of entropy power of a random variable introduced in Section 5.1. After that we state two equivalent forms of our conjectured Entropy Photon-number Inequality (EPnI). Section 5.3 describes how the EPnI, if true, would immediately imply all the minimum output entropy conjectures from Chapter 4. In Section 5.4, we describe some recent progress that we have made towards a proof of the EPnI.


5.1  The  Entropy  Power  Inequality  (EPI) 


Because a real-valued, zero-mean Gaussian random variable $U$ has differential (Shannon) entropy given by $h(U) = \tfrac{1}{2}\ln(2\pi e\langle U^2\rangle)$, where the mean-squared value $\langle U^2\rangle$ is considered to be the power of $U$, we can define the entropy power of a random variable $X$, $P(X)$, to be the mean-squared value $\langle\tilde X^2\rangle$ of the zero-mean Gaussian random variable $\tilde X$ having an entropy equal to the entropy of $X$, i.e., $h(\tilde X) = h(X)$ and $P(X) = (1/2\pi e)\,e^{2h(X)}$. Further, let $\mathbf X$ and $\mathbf Y$ be statistically independent, $n$-dimensional, real-valued random vectors that possess differential entropies $h(\mathbf X)$ and $h(\mathbf Y)$ respectively. The entropy powers of $\mathbf X$ and $\mathbf Y$ are defined analogously:

$$P(\mathbf X) \equiv \frac{e^{2h(\mathbf X)/n}}{2\pi e} \quad\text{and}\quad P(\mathbf Y) \equiv \frac{e^{2h(\mathbf Y)/n}}{2\pi e}. \tag{5.1}$$

In this way, an $n$-dimensional, real-valued random vector $\tilde{\mathbf X}$ comprised of independent, identically distributed (i.i.d.), real-valued, zero-mean, variance-$P(\mathbf X)$ Gaussian random variables has differential entropy $h(\tilde{\mathbf X}) = h(\mathbf X)$. We can similarly define an i.i.d. Gaussian random vector $\tilde{\mathbf Y}$ with differential entropy $h(\tilde{\mathbf Y}) = h(\mathbf Y)$. We define a new random vector by the convex combination

$$\mathbf Z \equiv \sqrt{\eta}\,\mathbf X + \sqrt{1-\eta}\,\mathbf Y, \tag{5.2}$$

where $0 \le \eta \le 1$. This random vector has differential entropy $h(\mathbf Z)$ and entropy power $P(\mathbf Z)$. Furthermore, let $\tilde{\mathbf Z} \equiv \sqrt{\eta}\,\tilde{\mathbf X} + \sqrt{1-\eta}\,\tilde{\mathbf Y}$. Three equivalent forms of the Entropy Power Inequality (EPI), see, e.g., [68], are given by


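$$P(\mathbf Z) \ge \eta P(\mathbf X) + (1-\eta)P(\mathbf Y), \tag{5.3}$$

$$h(\mathbf Z) \ge h(\tilde{\mathbf Z}), \tag{5.4}$$

$$h(\mathbf Z) \ge \eta\,h(\mathbf X) + (1-\eta)\,h(\mathbf Y). \tag{5.5}$$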
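As a quick sanity check of form (5.3) in the scalar case ($n = 1$), the following Python sketch evaluates entropy powers from closed-form differential entropies. The helper names and the two test cases (Gaussian inputs, and two unit-variance uniform inputs at $\eta = 1/2$, whose normalized sum is triangular) are illustrative assumptions and not part of the original text.

```python
import math

def gaussian_entropy(variance):
    """Differential entropy (nats) of a zero-mean Gaussian with the given variance."""
    return 0.5 * math.log(2 * math.pi * math.e * variance)

def entropy_power(h, n=1):
    """Entropy power exp(2h/n) / (2*pi*e) of an n-dimensional vector with entropy h."""
    return math.exp(2.0 * h / n) / (2 * math.pi * math.e)

# Case 1: independent Gaussians. Z = sqrt(eta) X + sqrt(1-eta) Y is Gaussian
# with variance eta*var_x + (1-eta)*var_y, so (5.3) holds with equality.
eta, var_x, var_y = 0.3, 2.0, 5.0
p_z_gauss = entropy_power(gaussian_entropy(eta * var_x + (1 - eta) * var_y))
rhs_gauss = (eta * entropy_power(gaussian_entropy(var_x))
             + (1 - eta) * entropy_power(gaussian_entropy(var_y)))

# Case 2: X, Y i.i.d. uniform on [-sqrt(3), sqrt(3)] (unit variance), eta = 1/2.
# Z = (X + Y)/sqrt(2) is symmetric triangular on [-sqrt(6), sqrt(6)],
# whose differential entropy is 1/2 + ln(sqrt(6)).
h_uniform = math.log(2 * math.sqrt(3))
h_triangular = 0.5 + math.log(math.sqrt(6))
p_z_unif = entropy_power(h_triangular)
rhs_unif = 0.5 * entropy_power(h_uniform) + 0.5 * entropy_power(h_uniform)

print(p_z_gauss, rhs_gauss)  # equal up to rounding
print(p_z_unif, rhs_unif)    # strict inequality P(Z) > rhs
```

The Gaussian case illustrates that the EPI is tight for Gaussian inputs, while the uniform case shows a strict gap for non-Gaussian inputs.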

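5.2 The Entropy Photon-Number Inequality (EPnI)

Let $\hat{\mathbf a} = [\,\hat a_1\ \hat a_2\ \cdots\ \hat a_n\,]$ and $\hat{\mathbf b} = [\,\hat b_1\ \hat b_2\ \cdots\ \hat b_n\,]$ be vectors of photon annihilation operators for a collection of $2n$ different electromagnetic field modes of frequency $\omega$ [15]. Let the joint states of the modes associated with $\hat{\mathbf a}$ and $\hat{\mathbf b}$ be statistically independent of each other, and thus be given by the product-state density operator $\hat\rho_{\mathbf{ab}} = \hat\rho_{\mathbf a}\otimes\hat\rho_{\mathbf b}$, where $\hat\rho_{\mathbf a}$ and $\hat\rho_{\mathbf b}$ are the density operators associated with the $\hat{\mathbf a}$ and $\hat{\mathbf b}$ modes, respectively. The von Neumann entropies of the $\hat{\mathbf a}$ and $\hat{\mathbf b}$ modes are $S(\hat\rho_{\mathbf a}) = -\mathrm{tr}[\hat\rho_{\mathbf a}\ln(\hat\rho_{\mathbf a})]$ and $S(\hat\rho_{\mathbf b}) = -\mathrm{tr}[\hat\rho_{\mathbf b}\ln(\hat\rho_{\mathbf b})]$. We define a new vector of photon annihilation operators, $\hat{\mathbf c} = [\,\hat c_1\ \hat c_2\ \cdots\ \hat c_n\,]$, by the convex combination

$$\hat{\mathbf c} \equiv \sqrt{\eta}\,\hat{\mathbf a} + \sqrt{1-\eta}\,\hat{\mathbf b}, \quad\text{for } 0 \le \eta \le 1, \tag{5.6}$$

and use $\hat\rho_{\mathbf c}$ to denote its density operator. This is equivalent to saying that $\hat c_i$ is the output of a lossless beam splitter whose inputs, $\hat a_i$ and $\hat b_i$, couple to that output with transmissivity $\eta$ and reflectivity $1-\eta$, respectively.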


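5.2.1 EPnI for Wehrl entropy: Corollary 4.2

In analogy to the notion of entropy power of a random variable, let us define the Wehrl-entropy photon numbers of the $n$-mode density operators $\hat\rho_{\mathbf a}$ and $\hat\rho_{\mathbf b}$ as follows:

$$N_W(\hat\rho_{\mathbf a}) \equiv g_W^{-1}\left(W(\hat\rho_{\mathbf a})/n\right) \quad\text{and} \tag{5.7}$$

$$N_W(\hat\rho_{\mathbf b}) \equiv g_W^{-1}\left(W(\hat\rho_{\mathbf b})/n\right), \tag{5.8}$$

where $g_W(N) \equiv 1+\ln(1+N)$ is the Wehrl entropy of the thermal state $\hat\rho_T$ with mean photon number $N$ and $g_W^{-1}(x) = e^{x-1}-1$ is the well-defined inverse function of $g_W(\cdot)$ for $x \ge 0$. Thus, if $\tilde\rho_{\mathbf a} \equiv \bigotimes_{i=1}^{n}\hat\rho_{Ta_i}$ and $\tilde\rho_{\mathbf b} \equiv \bigotimes_{i=1}^{n}\hat\rho_{Tb_i}$, where $\hat\rho_{Ta_i}$ is the thermal state of average photon number $N_W(\hat\rho_{\mathbf a})$ for the $\hat a_i$ mode and $\hat\rho_{Tb_i}$ is the thermal state of average photon number $N_W(\hat\rho_{\mathbf b})$ for the $\hat b_i$ mode, we have that $W(\tilde\rho_{\mathbf a}) = W(\hat\rho_{\mathbf a})$ and $W(\tilde\rho_{\mathbf b}) = W(\hat\rho_{\mathbf b})$.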

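For the vector of photon annihilation operators $\hat{\mathbf c} = [\,\hat c_1\ \hat c_2\ \cdots\ \hat c_n\,]$ that is given by the convex combination (5.6), it is straightforward to see that Eqs. (5.3)-(5.5) can be recast into the following three equivalent forms, which we call the Wehrl-Entropy Photon-number Inequality (WEPnI):

$$N_W(\hat\rho_{\mathbf c}) \ge \eta N_W(\hat\rho_{\mathbf a}) + (1-\eta)N_W(\hat\rho_{\mathbf b}), \tag{5.9}$$

$$W(\hat\rho_{\mathbf c}) \ge W(\tilde\rho_{\mathbf c}), \quad\text{and} \tag{5.10}$$

$$W(\hat\rho_{\mathbf c}) \ge \eta W(\hat\rho_{\mathbf a}) + (1-\eta)W(\hat\rho_{\mathbf b}), \tag{5.11}$$

where $\tilde\rho_{\mathbf c} \equiv \bigotimes_{i=1}^{n}\hat\rho_{Tc_i}$, with $\hat\rho_{Tc_i}$ being the thermal state of average photon number $\eta N_W(\hat\rho_{\mathbf a}) + (1-\eta)N_W(\hat\rho_{\mathbf b})$ for $\hat c_i$. Equation (5.9) is the same as Corollary 4.2.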
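The definitions (5.7)-(5.8) and the bound (5.9) can be sketched in a few lines of Python. The function names and the numerical values of $\eta$, $N$, $K$ and $n$ below are illustrative assumptions. For thermal inputs the sketch reproduces the Chapter 4 result $n(1+\ln(1+\eta N+(1-\eta)K))$ as the lower bound on the output Wehrl entropy.

```python
import math

def g_w(n_photons):
    """Wehrl entropy of a single-mode thermal state with the given mean photon number."""
    return 1.0 + math.log(1.0 + n_photons)

def g_w_inv(x):
    """Inverse of g_w: e^(x-1) - 1."""
    return math.exp(x - 1.0) - 1.0

def wehrl_photon_number(wehrl_entropy, n_modes=1):
    """Wehrl-entropy photon number N_W = g_w^{-1}(W / n), as in (5.7)-(5.8)."""
    return g_w_inv(wehrl_entropy / n_modes)

# Illustrative values (assumptions): n-mode thermal inputs with N and K photons per mode.
eta, N, K, n_modes = 0.4, 3.0, 1.5, 2
W_a = n_modes * g_w(N)
W_b = n_modes * g_w(K)

# Right-hand side of (5.9), then converted back into a Wehrl-entropy lower bound.
nw_bound = (eta * wehrl_photon_number(W_a, n_modes)
            + (1 - eta) * wehrl_photon_number(W_b, n_modes))
W_c_lower = n_modes * g_w(nw_bound)

# Closed form from Chapter 4 for comparison.
W_c_closed_form = n_modes * (1 + math.log(1 + eta * N + (1 - eta) * K))
print(W_c_lower, W_c_closed_form)
```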

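5.2.2 EPnI for von Neumann entropy: Conjectured

Let us define the entropy photon numbers of the $n$-mode density operators $\hat\rho_{\mathbf a}$ and $\hat\rho_{\mathbf b}$ as follows:

$$N(\hat\rho_{\mathbf a}) \equiv g^{-1}\left(S(\hat\rho_{\mathbf a})/n\right) \quad\text{and} \tag{5.12}$$

$$N(\hat\rho_{\mathbf b}) \equiv g^{-1}\left(S(\hat\rho_{\mathbf b})/n\right), \tag{5.13}$$

where $g^{-1}(y)$ is the well-defined inverse function of $y = g(x) \equiv (1+x)\ln(1+x) - x\ln(x)$, for $x \ge 0$. Thus, if $\tilde\rho_{\mathbf a} \equiv \bigotimes_{i=1}^{n}\hat\rho_{Ta_i}$ and $\tilde\rho_{\mathbf b} \equiv \bigotimes_{i=1}^{n}\hat\rho_{Tb_i}$, where $\hat\rho_{Ta_i}$ is the thermal state of average photon number $N(\hat\rho_{\mathbf a})$ for the $\hat a_i$ mode and $\hat\rho_{Tb_i}$ is the thermal state of average photon number $N(\hat\rho_{\mathbf b})$ for the $\hat b_i$ mode, we have $S(\tilde\rho_{\mathbf a}) = S(\hat\rho_{\mathbf a})$ and $S(\tilde\rho_{\mathbf b}) = S(\hat\rho_{\mathbf b})$.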
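Because $g^{-1}$ has no closed form, the entropy photon number must be evaluated numerically. The following is a minimal sketch, assuming a simple bisection search (any monotone root finder would do). The function names are hypothetical.

```python
import math

def g(x):
    """Von Neumann entropy (nats) of a thermal state with mean photon number x >= 0."""
    if x < 0:
        raise ValueError("mean photon number must be non-negative")
    if x == 0:
        return 0.0  # limit of x*ln(x) as x -> 0
    return (1 + x) * math.log(1 + x) - x * math.log(x)

def g_inv(y, tol=1e-12):
    """Inverse of g by bisection; valid because g is strictly increasing on [0, inf)."""
    if y < 0:
        raise ValueError("entropy must be non-negative")
    if y == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    while g(hi) < y:          # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)

def entropy_photon_number(entropy, n_modes=1):
    """Entropy photon number N = g^{-1}(S / n), as in (5.12)-(5.13)."""
    return g_inv(entropy / n_modes)

# A product of three thermal modes with 0.7 photons each has N = 0.7.
print(entropy_photon_number(3 * g(0.7), n_modes=3))
```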

For the vector of photon annihilation operators $\hat{\mathbf c} = [\,\hat c_1\ \hat c_2\ \cdots\ \hat c_n\,]$ that is given by the convex combination (5.6), we conjecture the following two equivalent forms of the Entropy Photon-number Inequality (EPnI):

$$N(\hat\rho_{\mathbf c}) \ge \eta N(\hat\rho_{\mathbf a}) + (1-\eta)N(\hat\rho_{\mathbf b}), \tag{5.14}$$

$$S(\hat\rho_{\mathbf c}) \ge S(\tilde\rho_{\mathbf c}), \tag{5.15}$$

where $\tilde\rho_{\mathbf c} \equiv \bigotimes_{i=1}^{n}\hat\rho_{Tc_i}$, with $\hat\rho_{Tc_i}$ being the thermal state of average photon number $\eta N(\hat\rho_{\mathbf a}) + (1-\eta)N(\hat\rho_{\mathbf b})$ for $\hat c_i$. By analogy with the classical EPI and the quantum WEPnI, we might expect there to be a third equivalent form of the quantum EPnI, viz.,

$$S(\hat\rho_{\mathbf c}) \ge \eta S(\hat\rho_{\mathbf a}) + (1-\eta)S(\hat\rho_{\mathbf b}). \tag{5.16}$$

It is easily shown (see below) that (5.14) implies (5.16), but we have not been able to prove the converse. Indeed, we suspect that the converse might be false.

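Proof of equivalence between different forms of the EPnI

Below, we prove the equivalence of the two forms of the EPnI in Eqs. (5.14) and (5.15), and we also prove that (5.14) implies (5.16). If we could also prove that (5.16) implies (5.14), all three forms of the conjectured EPnI would be equivalent.

1. To show that (5.14) implies (5.15), assume (5.14) is true:

$$N(\hat\rho_{\mathbf c}) \ge \eta N(\hat\rho_{\mathbf a}) + (1-\eta)N(\hat\rho_{\mathbf b}) \tag{5.17}$$

$$= \eta N(\tilde\rho_{\mathbf a}) + (1-\eta)N(\tilde\rho_{\mathbf b}). \tag{5.18}$$

Now, if $\tilde\rho_{\mathbf{ab}} = \tilde\rho_{\mathbf a}\otimes\tilde\rho_{\mathbf b}$ is the joint density operator of the $\hat{\mathbf a}$ and $\hat{\mathbf b}$ modes, we find that the state of the $\hat{\mathbf c}$ modes is $\tilde\rho_{\mathbf c} = \bigotimes_{i=1}^{n}\hat\rho_{Tc_i}$, where $\hat\rho_{Tc_i}$ is a thermal state with average photon number given by $N(\tilde\rho_{\mathbf c}) = \eta N(\tilde\rho_{\mathbf a}) + (1-\eta)N(\tilde\rho_{\mathbf b})$, so that $S(\tilde\rho_{\mathbf c}) = n\,g[N(\tilde\rho_{\mathbf c})]$. Thus, from (5.18) we get $N(\hat\rho_{\mathbf c}) \ge N(\tilde\rho_{\mathbf c}) = g^{-1}(S(\tilde\rho_{\mathbf c})/n)$. Taking $g(\cdot)$ of both sides of this inequality, and using the fact that $g$ is monotonically increasing, completes the proof.

2. To show that (5.15) implies (5.14), assume (5.15) is true:

$$\begin{aligned}
N(\hat\rho_{\mathbf c}) &= g^{-1}\left(S(\hat\rho_{\mathbf c})/n\right) \\
&\ge g^{-1}\left(S(\tilde\rho_{\mathbf c})/n\right) = g^{-1}\left[g\left(\eta N(\tilde\rho_{\mathbf a}) + (1-\eta)N(\tilde\rho_{\mathbf b})\right)\right] \\
&= \eta N(\tilde\rho_{\mathbf a}) + (1-\eta)N(\tilde\rho_{\mathbf b}) \\
&= \eta N(\hat\rho_{\mathbf a}) + (1-\eta)N(\hat\rho_{\mathbf b}),
\end{aligned} \tag{5.19}$$

where the inequality is due to $g^{-1}(S)$ being a monotonically increasing function of $S$, and the proof is complete.

3. To show that (5.14) implies (5.16), assume that (5.14) is true. We then have that $N(\hat\rho_{\mathbf c}) \ge \eta N(\hat\rho_{\mathbf a}) + (1-\eta)N(\hat\rho_{\mathbf b})$, so that

$$S(\hat\rho_{\mathbf c}) = n\,g[N(\hat\rho_{\mathbf c})] \ge n\,g[\eta N(\hat\rho_{\mathbf a}) + (1-\eta)N(\hat\rho_{\mathbf b})] \tag{5.20}$$

$$\ge \eta\,n\,g[N(\hat\rho_{\mathbf a})] + (1-\eta)\,n\,g[N(\hat\rho_{\mathbf b})] \tag{5.21}$$

$$= \eta S(\hat\rho_{\mathbf a}) + (1-\eta)S(\hat\rho_{\mathbf b}), \tag{5.22}$$

where the first inequality follows from $g(N)$ being monotonically increasing, the second inequality follows from $g(N)$ being concave, and the proof is complete.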
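The step from (5.20) to (5.21) rests only on the concavity of $g$. The following Python sketch spot-checks that property on random samples. The sampling ranges, seed and helper names are arbitrary assumptions, and a numerical check of this kind illustrates the inequality but does not prove it.

```python
import math
import random

def g(x):
    """g(x) = (1+x) ln(1+x) - x ln(x), with g(0) = 0."""
    if x == 0:
        return 0.0
    return (1 + x) * math.log(1 + x) - x * math.log(x)

def concavity_gap(x, y, eta):
    """g(eta*x + (1-eta)*y) - [eta*g(x) + (1-eta)*g(y)]; non-negative if g is concave."""
    return g(eta * x + (1 - eta) * y) - (eta * g(x) + (1 - eta) * g(y))

random.seed(0)
min_gap = min(
    concavity_gap(random.uniform(0, 10), random.uniform(0, 10), random.random())
    for _ in range(10000)
)
print(min_gap)  # non-negative up to floating-point rounding
```

The analytic reason is that $g''(x) = 1/(1+x) - 1/x < 0$ for all $x > 0$.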
5.3 Relationship of the EPnI with the Minimum Output Entropy Conjectures

More important than whether or not (5.16) is equivalent to (5.14) and (5.15) is the role of the EPnI in proving classical information capacity results for bosonic channels. In particular, the EPnI (5.14) provides simple proofs of the strong versions of the three minimum output entropy conjectures we stated in Section 4.1. These conjectures are important because proving minimum output entropy conjecture 1 also proves the conjectured capacity of the thermal-noise channel [9], proving minimum output entropy conjecture 2 also proves the conjectured capacity region of the bosonic broadcast channel [12], and proving minimum output entropy conjecture 3 also proves the conjectured capacity region of the bosonic broadcast channel with additive thermal noise (see Chapter 3). Furthermore, as we have shown in Chapter 3, proving minimum output entropy conjecture 2 also establishes the privacy capacity of the bosonic wiretap channel and the single-letter quantum capacity of the lossy bosonic channel. Before we prove that the EPnI subsumes all the minimum output entropy conjectures, we restate the conjectures below for ease of reference.

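Minimum Output Entropy Conjecture 1: Let $\hat{\mathbf a}$ and $\hat{\mathbf b}$ be $n$-dimensional vectors of annihilation operators, with joint density operator $\hat\rho_{\mathbf{ab}} = \left(|\psi\rangle_{\mathbf a\,\mathbf a}\langle\psi|\right)\otimes\hat\rho_{\mathbf b}$, where $|\psi\rangle_{\mathbf a}$ is an arbitrary zero-mean-field pure state of the $\hat{\mathbf a}$ modes and $\hat\rho_{\mathbf b} = \bigotimes_{i=1}^{n}\hat\rho_{Tb_i}$ with $\hat\rho_{Tb_i}$ being the $\hat b_i$ mode's thermal state of average photon number $N$. Define a new vector of photon annihilation operators, $\hat{\mathbf c} = [\,\hat c_1\ \hat c_2\ \cdots\ \hat c_n\,]$, by the convex combination (5.6) and use $\hat\rho_{\mathbf c}$ to denote its density operator and $S(\hat\rho_{\mathbf c})$ to denote its von Neumann entropy. Then choosing $|\psi\rangle_{\mathbf a}$ to be the $n$-mode vacuum state minimizes $S(\hat\rho_{\mathbf c})$. The resulting minimum output entropy is $S(\hat\rho_{\mathbf c}) = n\,g((1-\eta)N)$.

Minimum Output Entropy Conjecture 2: Let $\hat{\mathbf a}$ and $\hat{\mathbf b}$ be $n$-dimensional vectors of annihilation operators with joint density operator $\hat\rho_{\mathbf{ab}} = \left(|\psi\rangle_{\mathbf a\,\mathbf a}\langle\psi|\right)\otimes\hat\rho_{\mathbf b}$, where $|\psi\rangle_{\mathbf a} = \bigotimes_{i=1}^{n}|0\rangle_{a_i}$ is the $n$-mode vacuum state and $\hat\rho_{\mathbf b}$ has von Neumann entropy $S(\hat\rho_{\mathbf b}) = n\,g(K)$ for some $K \ge 0$. Define a new vector of photon annihilation operators, $\hat{\mathbf c} = [\,\hat c_1\ \hat c_2\ \cdots\ \hat c_n\,]$, by the convex combination (5.6) and use $\hat\rho_{\mathbf c}$ to denote its density operator and $S(\hat\rho_{\mathbf c})$ to denote its von Neumann entropy. Then choosing $\hat\rho_{\mathbf b} = \bigotimes_{i=1}^{n}\hat\rho_{Tb_i}$ with $\hat\rho_{Tb_i}$ being the $\hat b_i$ mode's thermal state of average photon number $K$ minimizes $S(\hat\rho_{\mathbf c})$. The resulting minimum output entropy is $S(\hat\rho_{\mathbf c}) = n\,g((1-\eta)K)$.

Minimum Output Entropy Conjecture 3: Let $\hat{\mathbf a}$ and $\hat{\mathbf b}$ be $n$-dimensional vectors of annihilation operators with joint density operator $\hat\rho_{\mathbf{ab}} = \hat\rho_{\mathbf a}\otimes\hat\rho_{\mathbf b}$, where $\hat\rho_{\mathbf a} = \bigotimes_{i=1}^{n}\hat\rho_{Ta_i}$ with $\hat\rho_{Ta_i}$ being the $\hat a_i$ mode's thermal state of average photon number $N$, and $\hat\rho_{\mathbf b}$ has von Neumann entropy $S(\hat\rho_{\mathbf b}) = n\,g(K)$ for some $K \ge 0$. Define a new vector of photon annihilation operators, $\hat{\mathbf c} = [\,\hat c_1\ \hat c_2\ \cdots\ \hat c_n\,]$, by the convex combination (5.6) and use $\hat\rho_{\mathbf c}$ to denote its density operator and $S(\hat\rho_{\mathbf c})$ to denote its von Neumann entropy. Then choosing $\hat\rho_{\mathbf b} = \bigotimes_{i=1}^{n}\hat\rho_{Tb_i}$ with $\hat\rho_{Tb_i}$ being the $\hat b_i$ mode's thermal state of average photon number $K$ minimizes $S(\hat\rho_{\mathbf c})$. The resulting minimum output entropy is $S(\hat\rho_{\mathbf c}) = n\,g(\eta N + (1-\eta)K)$.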

To see that the EPnI encompasses all three of the preceding minimum output entropy conjectures, we begin by using the premise of conjecture 1 in (5.14). Because the $\hat{\mathbf a}$ modes are in a pure state, we get $S(\hat\rho_{\mathbf a}) = 0$, so that $N(\hat\rho_{\mathbf a}) = 0$, and hence the EPnI tells us that

$$N(\hat\rho_{\mathbf c}) \ge (1-\eta)N(\hat\rho_{\mathbf b}) = (1-\eta)N. \tag{5.23}$$

Taking $g(\cdot)$ on both sides of this inequality, we get $S(\hat\rho_{\mathbf c})/n \ge g[(1-\eta)N]$. But, if $|\psi\rangle_{\mathbf a}$ is the $n$-mode vacuum state, we can easily show that $\hat\rho_{\mathbf c} = \bigotimes_{i=1}^{n}\hat\rho_{Tc_i}$, with $\hat\rho_{Tc_i}$ being the $\hat c_i$ mode's thermal state of average photon number $(1-\eta)N$. Thus, when $|\psi\rangle_{\mathbf a}$ is the $n$-mode vacuum state we get $S(\hat\rho_{\mathbf c}) = n\,g[(1-\eta)N]$, which completes the proof.

Next, we apply the premise of conjecture 2 in (5.14). Once again, the $\hat{\mathbf a}$ modes are in a pure state, so we get

$$N(\hat\rho_{\mathbf c}) \ge (1-\eta)N(\hat\rho_{\mathbf b}) = (1-\eta)K, \tag{5.24}$$

and hence $S(\hat\rho_{\mathbf c})/n \ge g[(1-\eta)K]$. But, taking $\hat\rho_{\mathbf b} = \bigotimes_{i=1}^{n}\hat\rho_{Tb_i}$, with $\hat\rho_{Tb_i}$ being the $\hat b_i$ mode's thermal state of average photon number $K$, satisfies the premise of minimum output entropy conjecture 2 and implies that $\hat\rho_{\mathbf c} = \bigotimes_{i=1}^{n}\hat\rho_{Tc_i}$, with $\hat\rho_{Tc_i}$ being the $\hat c_i$ mode's thermal state of average photon number $(1-\eta)K$. In this case we have $S(\hat\rho_{\mathbf c}) = n\,g[(1-\eta)K]$, which completes the proof.

Finally, we apply the premise of conjecture 3 in (5.14). The input state is $\hat\rho_{\mathbf a} = \bigotimes_{i=1}^{n}\hat\rho_{Ta_i}$, with $\hat\rho_{Ta_i}$ being the $\hat a_i$ mode's thermal state of average photon number $N$. So we get

$$N(\hat\rho_{\mathbf c}) \ge \eta N(\hat\rho_{\mathbf a}) + (1-\eta)N(\hat\rho_{\mathbf b}) = \eta N + (1-\eta)K, \tag{5.25}$$

and hence $S(\hat\rho_{\mathbf c})/n \ge g[\eta N + (1-\eta)K]$. But, taking $\hat\rho_{\mathbf b} = \bigotimes_{i=1}^{n}\hat\rho_{Tb_i}$, with $\hat\rho_{Tb_i}$ being the $\hat b_i$ mode's thermal state of average photon number $K$, satisfies the premise of minimum output entropy conjecture 3 and implies that $\hat\rho_{\mathbf c} = \bigotimes_{i=1}^{n}\hat\rho_{Tc_i}$, with $\hat\rho_{Tc_i}$ being the $\hat c_i$ mode's thermal state of average photon number $\eta N + (1-\eta)K$. In this case we have $S(\hat\rho_{\mathbf c}) = n\,g[\eta N + (1-\eta)K]$, which completes the proof.
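To illustrate conjecture 1 in the single-mode case, the sketch below compares the output entropy for a vacuum input with that for squeezed-vacuum inputs, using the Gaussian-state entropy formula $S = g\big(\sqrt{(N+1/2)^2-|P|^2}-1/2\big)$ from Section 5.4.1. The squeezed-vacuum moments $N_a = \sinh^2 r$, $|P_a| = \sinh r\cosh r$ are the standard parametrization and are an assumption brought in for this example, as are the function names and numerical values.

```python
import math

def g(x):
    """g(x) = (1+x) ln(1+x) - x ln(x), with g(0) = 0."""
    if x <= 0:
        return 0.0
    return (1 + x) * math.log(1 + x) - x * math.log(x)

def gaussian_entropy(n_mean, p_abs):
    """Entropy of a zero-mean single-mode Gaussian state with moments (N, |P|)."""
    nu = math.sqrt((n_mean + 0.5) ** 2 - p_abs ** 2)
    return g(nu - 0.5)

def output_entropy_squeezed_input(r, eta, n_thermal):
    """Output entropy when mode a is squeezed vacuum (parameter r) and b is thermal."""
    n_a = math.sinh(r) ** 2                 # mean photon number of squeezed vacuum
    p_a = math.sinh(r) * math.cosh(r)       # magnitude of the phase-sensitive moment
    n_c = eta * n_a + (1 - eta) * n_thermal
    p_c = eta * p_a                         # thermal input b has P_b = 0
    return gaussian_entropy(n_c, p_c)

eta, n_thermal = 0.5, 1.0
conjectured_min = g((1 - eta) * n_thermal)  # vacuum input, as in conjecture 1 with n = 1
entropies = [output_entropy_squeezed_input(r, eta, n_thermal) for r in (0.0, 0.5, 1.0, 1.5)]
print(conjectured_min, entropies)
```

With these values the vacuum input ($r = 0$) attains $g((1-\eta)N)$ exactly, and every squeezed input gives a strictly larger output entropy.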

5.4 Evidence in Support of the EPnI

As opposed to the extensive body of evidence that supports the validity of conjectures 1 and 2, we do not yet have nearly as much evidence for the conjectured EPnI. The EPnI might turn out to be harder to prove than our earlier conjectures, because it is a more powerful result. However, there is a large existing literature on various ways to prove the classical EPI [68]. By drawing upon those approaches we may be able to prove the quantum EPnI. Below, we summarize the evidence we have collected so far supporting the validity of the EPnI.

5.4.1 Proof of EPnI for product Gaussian state inputs

A natural starting point in trying to prove the EPnI in its most general form would be to prove it when the input states $\hat\rho_{\mathbf a}$ and $\hat\rho_{\mathbf b}$ (and thus the output state $\hat\rho_{\mathbf c}$) are restricted to be Gaussian states$^1$. Even though we can prove strong conjectures 1 and 2 when restricted to Gaussian input states [12], we have not been able to prove the EPnI with this input restriction. Nevertheless, we have been able to prove the EPnI for single-mode states ($n = 1$) with the Gaussian-input restriction. In other words, we have proved the EPnI when both the inputs $\hat\rho_{\mathbf a}$ and $\hat\rho_{\mathbf b}$ are tensor products of single-mode Gaussian states.

$^1$ Gaussian states are states that are completely described by the first- and second-order moments of their field operators. For a quick overview of Gaussian states, see [69].

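Theorem 5.1 [EPnI for product Gaussian state inputs: Guha, Erkmen, 2008]: Single-mode fields $\hat a$ and $\hat b$ excited in statistically independent Gaussian states $\hat\rho_a$ and $\hat\rho_b$ are inputs to a beam splitter of transmissivity $\eta$, resulting in the output mode, $\hat c = \sqrt{\eta}\,\hat a + \sqrt{1-\eta}\,\hat b$, in a Gaussian state $\hat\rho_c$. Then the following inequality holds:

$$g^{-1}\left(S(\hat\rho_c)\right) \ge \eta\,g^{-1}\left(S(\hat\rho_a)\right) + (1-\eta)\,g^{-1}\left(S(\hat\rho_b)\right), \tag{5.26}$$

with equality when $\hat a$ and $\hat b$ are in thermal states.

Proof: The von Neumann entropy $S(\hat\rho_a)$ is independent of the mean field $\langle\hat a\rangle$. Hence, without loss of generality, let us suppress the mean-field values of all the states and assume that $\langle\hat a\rangle = \langle\hat b\rangle = \langle\hat c\rangle = 0$. For a single-mode Gaussian state $\hat\rho_a$, with mean field $\langle\hat a\rangle = 0$ and covariance matrix$^2$

$$K_a \equiv \begin{pmatrix} \langle\Delta\hat a\,\Delta\hat a^\dagger\rangle & \langle\Delta\hat a^2\rangle \\ \langle\Delta\hat a^{\dagger 2}\rangle & \langle\Delta\hat a^\dagger\Delta\hat a\rangle \end{pmatrix} = \begin{pmatrix} \langle\hat a\hat a^\dagger\rangle & \langle\hat a^2\rangle \\ \langle\hat a^{\dagger 2}\rangle & \langle\hat a^\dagger\hat a\rangle \end{pmatrix} = \begin{pmatrix} 1+N_a & P_a \\ P_a^* & N_a \end{pmatrix}, \tag{5.27}$$

where $\Delta\hat a \equiv \hat a - \langle\hat a\rangle$, the Wigner characteristic function $\chi_W^{\rho_a}(\zeta) \equiv \mathrm{Tr}\left(\hat\rho_a\,e^{-\zeta^*\hat a + \zeta\hat a^\dagger}\right)$ can be shown to be given by (see Appendix A)

$$\chi_W^{\rho_a}(\zeta) = \exp\left(\alpha^*\zeta - \alpha\zeta^* + \Re(P_a^*\zeta^2) - \left(N_a + \tfrac{1}{2}\right)|\zeta|^2\right). \tag{5.28}$$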


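Let the input state $\hat\rho_b$ be a Gaussian state with mean field $\langle\hat b\rangle = 0$ and covariance matrix

$$K_b \equiv \begin{pmatrix} \langle\Delta\hat b\,\Delta\hat b^\dagger\rangle & \langle\Delta\hat b^2\rangle \\ \langle\Delta\hat b^{\dagger 2}\rangle & \langle\Delta\hat b^\dagger\Delta\hat b\rangle \end{pmatrix} = \begin{pmatrix} \langle\hat b\hat b^\dagger\rangle & \langle\hat b^2\rangle \\ \langle\hat b^{\dagger 2}\rangle & \langle\hat b^\dagger\hat b\rangle \end{pmatrix} = \begin{pmatrix} 1+N_b & P_b \\ P_b^* & N_b \end{pmatrix}. \tag{5.29}$$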


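$^2$ The commutation relation $[\hat a, \hat a^\dagger] = 1$ implies that $\langle\Delta\hat a\,\Delta\hat a^\dagger\rangle = 1 + \langle\Delta\hat a^\dagger\Delta\hat a\rangle$. Also, for a zero-mean-field ($\langle\hat a\rangle = 0$) state, $\langle\Delta\hat a^\dagger\Delta\hat a\rangle = \langle\hat a^\dagger\hat a\rangle$ is the mean photon number in the state, hence justifying the notation $N_a$, as we can always choose $\langle\hat a\rangle = 0$ because von Neumann entropy is invariant to shifts in the mean field.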
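Using the beam splitter transformation $\hat c = \sqrt{\eta}\,\hat a + \sqrt{1-\eta}\,\hat b$, and the fact that $\hat a$ and $\hat b$ are independent modes, we can compute the Wigner characteristic function of $\hat\rho_c$ via $\chi_W^{\rho_c}(\zeta) = \chi_W^{\rho_a}(\sqrt{\eta}\,\zeta)\,\chi_W^{\rho_b}(\sqrt{1-\eta}\,\zeta)$. Thus it is easy to see that $\hat\rho_c$ is a Gaussian state with mean field $\langle\hat c\rangle = \sqrt{\eta}\,\alpha + \sqrt{1-\eta}\,\beta$ (with $\alpha \equiv \langle\hat a\rangle$ and $\beta \equiv \langle\hat b\rangle$), and covariance matrix $K_c = \eta K_a + (1-\eta)K_b$, i.e.,

$$K_c = \begin{pmatrix} 1+N_c & P_c \\ P_c^* & N_c \end{pmatrix}, \tag{5.30}$$

with $N_c = \eta N_a + (1-\eta)N_b$, and $P_c = \eta P_a + (1-\eta)P_b$.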

When the phase-sensitive (off-diagonal) term in the covariance matrix $K_a$ vanishes, i.e., $P_a = 0$, the Gaussian state $\hat\rho_a$ is a thermal state, whose Wigner characteristic function is circularly symmetric Gaussian about its mean. Using the symplectic diagonalization$^3$ $\hat\rho_a = \hat U\hat\rho_{T,\bar N_a}\hat U^\dagger$, where $\hat\rho_{T,\bar N_a}$ is a zero-mean thermal state with mean photon number $\bar N_a = \sqrt{(N_a+1/2)^2 - |P_a|^2} - 1/2$, we have $S(\hat\rho_a) = g(\bar N_a)$. Using symplectic diagonalizations of $\hat\rho_b$ and $\hat\rho_c$, we similarly have $S(\hat\rho_b) = g(\bar N_b) = g\left(\sqrt{(N_b+1/2)^2 - |P_b|^2} - 1/2\right)$ and $S(\hat\rho_c) = g(\bar N_c) = g\left(\sqrt{(N_c+1/2)^2 - |P_c|^2} - 1/2\right)$. Hence, the statement of Theorem 5.1 is equivalent to the following:

For complex numbers $P_a, P_b \in \mathbb{C}$, and non-negative real numbers $N_a, N_b \in \mathbb{R}_+$, it follows that

$$\sqrt{(N_c+1/2)^2 - |P_c|^2} - \frac{1}{2} \ge \eta\left(\sqrt{(N_a+1/2)^2 - |P_a|^2} - \frac{1}{2}\right) + (1-\eta)\left(\sqrt{(N_b+1/2)^2 - |P_b|^2} - \frac{1}{2}\right), \tag{5.32}$$

where $P_c = \eta P_a + (1-\eta)P_b$ and $N_c = \eta N_a + (1-\eta)N_b$.
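The closed form $\bar N = \sqrt{(N+1/2)^2-|P|^2}-1/2$ can be cross-checked against the determinant of the real quadrature covariance matrix. The sketch below assumes the quadrature convention $\hat x = (\hat a+\hat a^\dagger)/\sqrt2$, $\hat p = (\hat a-\hat a^\dagger)/(j\sqrt2)$ with vacuum variance $1/2$. That convention, the helper names and the numerical values are assumptions made for this illustration and are not taken from the text.

```python
import math

def g(x):
    """g(x) = (1+x) ln(1+x) - x ln(x), with g(0) = 0."""
    if x <= 0:
        return 0.0
    return (1 + x) * math.log(1 + x) - x * math.log(x)

def thermal_equivalent_photons(n_mean, p):
    """Mean photon number of the thermal state unitarily equivalent to a Gaussian state (N, P)."""
    return math.sqrt((n_mean + 0.5) ** 2 - abs(p) ** 2) - 0.5

def quadrature_covariance(n_mean, p):
    """2x2 covariance matrix of (x, p) for a zero-mean state with <a†a> = N and <a²> = P."""
    vxx = n_mean + 0.5 + p.real
    vpp = n_mean + 0.5 - p.real
    vxp = p.imag                      # symmetrized cross-moment <(xp + px)/2>
    return [[vxx, vxp], [vxp, vpp]]

def thermal_equivalent_from_quadratures(v):
    """Same quantity via sqrt(det V) - 1/2, valid for a single mode."""
    det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    return math.sqrt(det) - 0.5

def gaussian_entropy(n_mean, p):
    """Von Neumann entropy of a zero-mean single-mode Gaussian state."""
    return g(thermal_equivalent_photons(n_mean, p))

n_a, p_a = 2.0, complex(0.8, -0.6)
nbar_direct = thermal_equivalent_photons(n_a, p_a)
nbar_quad = thermal_equivalent_from_quadratures(quadrature_covariance(n_a, p_a))
print(nbar_direct, nbar_quad, gaussian_entropy(n_a, p_a))
```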


$^3$ Any $n$-mode Gaussian state $\hat\rho_{\mathbf a}$ can be shown to be unitarily equivalent to a tensor product of $n$ independent thermal states with mean photon numbers $\lambda_i$, for $1 \le i \le n$, i.e.,

$$\hat\rho_{\mathbf a} = \hat U\left(\bigotimes_{i=1}^{n}\hat\rho_{T,\lambda_i}\right)\hat U^\dagger, \tag{5.31}$$

with $\hat\rho_{T,\lambda_i}$ being a thermal state of average photon number $\lambda_i$. The $\lambda_i$ are known as the symplectic eigenvalues of the Gaussian state $\hat\rho_{\mathbf a}$. Because a unitary operation leaves the von Neumann entropy of a state unchanged, $S(\hat\rho_{\mathbf a}) = \sum_{i=1}^{n} g(\lambda_i)$. See [70] for details of a systematic algorithm to compute the symplectic eigenvalues $\lambda_i$ for an arbitrary $n$-mode Gaussian state, given its covariance matrix.
Lemma 5.2: For non-negative real numbers $m_1$, $m_2$, $r_1$, $r_2$ and $\alpha \in \mathbb{R}$, satisfying $m_i \ge r_i$ for $i = 1, 2$,

$$m_1m_2 + r_1r_2\cos\alpha \ge \sqrt{(m_1^2 - r_1^2)(m_2^2 - r_2^2)}. \tag{5.33}$$

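Proof: Since $-1 \le \cos\alpha \le 1$, $m_1m_2 + r_1r_2\cos\alpha \ge m_1m_2 - r_1r_2$. Now,

$$(m_1r_2 - m_2r_1)^2 \ge 0, \quad\text{or} \tag{5.34}$$

$$m_1^2r_2^2 + m_2^2r_1^2 \ge 2m_1m_2r_1r_2, \quad\text{or} \tag{5.35}$$

$$m_1^2m_2^2 + r_1^2r_2^2 - 2m_1m_2r_1r_2 \ge m_1^2m_2^2 + r_1^2r_2^2 - m_1^2r_2^2 - m_2^2r_1^2, \quad\text{or} \tag{5.36}$$

$$m_1m_2 - r_1r_2 \ge \sqrt{(m_1^2 - r_1^2)(m_2^2 - r_2^2)}, \tag{5.37}$$

where taking the square root is justified because $m_1m_2 \ge r_1r_2$. Hence,

$$m_1m_2 + r_1r_2\cos\alpha \ge \sqrt{(m_1^2 - r_1^2)(m_2^2 - r_2^2)}. \tag{5.38}$$

Using Lemma 5.2 with the substitutions $m_1 = N_a + 1/2$, $m_2 = N_b + 1/2$, $P_a = r_1e^{j\theta_1}$, $P_b = r_2e^{j\theta_2}$ and $\alpha = \theta_1 - \theta_2 + \pi$ (the lemma holds for every real $\alpha$), we get$^4$

$$\left(N_a + \tfrac{1}{2}\right)\left(N_b + \tfrac{1}{2}\right) - \Re(P_aP_b^*) \ge \sqrt{\left(\left(N_a + \tfrac{1}{2}\right)^2 - |P_a|^2\right)\left(\left(N_b + \tfrac{1}{2}\right)^2 - |P_b|^2\right)}, \tag{5.39}$$

which can be seen to be equivalent to Eq. (5.32) with a few steps of simplification: moving the $1/2$ terms in (5.32) to one side, squaring, and substituting $N_c = \eta N_a + (1-\eta)N_b$ and $P_c = \eta P_a + (1-\eta)P_b$ leaves exactly the cross term $2\eta(1-\eta)$ times (5.39). It is readily verified from Eq. (5.32) that the inequality (5.26) is met with equality when $P_a = P_b = P_c = 0$, i.e., all the input and output states are thermal states.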
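Inequality (5.32) is easy to probe numerically. The following Python sketch draws random valid single-mode Gaussian moments $(N, P)$ and random transmissivities and records the smallest observed gap between the two sides. The sampling scheme, seed and helper names are assumptions made for illustration. Such a test can expose a counterexample but cannot replace the proof above.

```python
import cmath
import math
import random

def thermal_equivalent_photons(n_mean, p):
    """sqrt((N+1/2)^2 - |P|^2) - 1/2, i.e. g^{-1}(S) for a single-mode Gaussian state."""
    return math.sqrt((n_mean + 0.5) ** 2 - abs(p) ** 2) - 0.5

def random_gaussian_moments(rng):
    """Draw (N, P) satisfying |P|^2 <= N(N+1), which keeps the state physical."""
    n_mean = rng.uniform(0.0, 5.0)
    p_max = math.sqrt(n_mean * (n_mean + 1.0))
    p = cmath.rect(rng.uniform(0.0, p_max), rng.uniform(0.0, 2 * math.pi))
    return n_mean, p

def epni_gap(n_a, p_a, n_b, p_b, eta):
    """Left side minus right side of (5.32); Theorem 5.1 asserts this is >= 0."""
    n_c = eta * n_a + (1 - eta) * n_b
    p_c = eta * p_a + (1 - eta) * p_b
    lhs = thermal_equivalent_photons(n_c, p_c)
    rhs = (eta * thermal_equivalent_photons(n_a, p_a)
           + (1 - eta) * thermal_equivalent_photons(n_b, p_b))
    return lhs - rhs

rng = random.Random(1)
worst_gap = min(
    epni_gap(*random_gaussian_moments(rng), *random_gaussian_moments(rng), rng.random())
    for _ in range(20000)
)
thermal_gap = epni_gap(2.0, 0j, 0.5, 0j, 0.3)   # thermal inputs: equality expected
print(worst_gap, thermal_gap)
```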


$^4$ Note that with these substitutions, the condition $m_i \ge r_i$ in Lemma 5.2 is automatically satisfied, because the symplectic eigenvalue of a Gaussian state must be non-negative. Hence, $\sqrt{(N_a+1/2)^2 - |P_a|^2} - \tfrac{1}{2} \ge 0 \Rightarrow \sqrt{(N_a+1/2)^2 - |P_a|^2} \ge \tfrac{1}{2} > 0$.

5.4.2 Proof of the third form of EPnI for $\eta = 1/2$

We showed in Section 5.2.2 that the conjectured EPnI (5.14) is equivalent to a second form (5.15), both of which imply a third form (5.16). We have not been able to show whether or not the third form of the EPnI is equivalent to the first two forms. In this section, we will prove (5.16) for $\eta = 1/2$.

Theorem 5.3 [Giovannetti, 2008]: Suppose that $n$-mode fields, $\hat{\mathbf a} = [\,\hat a_1\ \hat a_2\ \cdots\ \hat a_n\,]$ and $\hat{\mathbf b} = [\,\hat b_1\ \hat b_2\ \cdots\ \hat b_n\,]$, in statistically independent states $\hat\rho_{\mathbf a}$ and $\hat\rho_{\mathbf b}$, are the inputs to a beam splitter of transmissivity $\eta = 1/2$, resulting in the $n$-mode output $\hat{\mathbf c} = [\,\hat c_1\ \hat c_2\ \cdots\ \hat c_n\,]$ such that $\hat{\mathbf c} = \sqrt{\eta}\,\hat{\mathbf a} + \sqrt{1-\eta}\,\hat{\mathbf b}$. Then,

$$S(\hat\rho_{\mathbf c}) \ge \frac{1}{2}S(\hat\rho_{\mathbf a}) + \frac{1}{2}S(\hat\rho_{\mathbf b}). \tag{5.40}$$


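Proof: Consider a beam splitter of transmissivity $\eta$ with two sets of statistically independent $n$-mode fields $\hat{\mathbf a}$ and $\hat{\mathbf b}$ as inputs, producing outputs $\hat{\mathbf c} = \sqrt{\eta}\,\hat{\mathbf a} + \sqrt{1-\eta}\,\hat{\mathbf b}$ and $\hat{\mathbf d} = \sqrt{1-\eta}\,\hat{\mathbf a} - \sqrt{\eta}\,\hat{\mathbf b}$. As the evolution from the joint input state $\hat\rho_{\mathbf{ab}}$ to the joint output state $\hat\rho_{\mathbf{cd}}$ is unitary, the total entropy remains unchanged, i.e.,

$$S(\hat\rho_{\mathbf{cd}}) = S(\hat\rho_{\mathbf{ab}}) \tag{5.41}$$

$$= S(\hat\rho_{\mathbf a}\otimes\hat\rho_{\mathbf b}) = S(\hat\rho_{\mathbf a}) + S(\hat\rho_{\mathbf b}), \tag{5.42}$$

where the second equality follows from the independence of $\hat{\mathbf a}$ and $\hat{\mathbf b}$.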

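Lemma 5.4: At least one of the following must be true:

$$S(\hat\rho_{\mathbf c}) \ge \eta S(\hat\rho_{\mathbf a}) + (1-\eta)S(\hat\rho_{\mathbf b}), \quad\text{or} \tag{5.43}$$

$$S(\hat\rho_{\mathbf d}) \ge (1-\eta)S(\hat\rho_{\mathbf a}) + \eta S(\hat\rho_{\mathbf b}). \tag{5.44}$$

Proof: Assume that both (5.43) and (5.44) are false. From subadditivity of von Neumann entropy (see [6]),

$$S(\hat\rho_{\mathbf{cd}}) \le S(\hat\rho_{\mathbf c}) + S(\hat\rho_{\mathbf d}) \tag{5.45}$$

$$< S(\hat\rho_{\mathbf a}) + S(\hat\rho_{\mathbf b}), \tag{5.46}$$

where the second inequality follows from our assumption that both (5.43) and (5.44) are false. Equations (5.42) and (5.46) then imply $S(\hat\rho_{\mathbf{cd}}) < S(\hat\rho_{\mathbf{ab}})$, which is a contradiction.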
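Now, let $\eta = 1/2$. Using Lemma 5.4, at least one of the following must be true:

$$S(\hat\rho_{\mathbf c}) \ge \frac{1}{2}S(\hat\rho_{\mathbf a}) + \frac{1}{2}S(\hat\rho_{\mathbf b}), \quad\text{or} \tag{5.47}$$

$$S(\hat\rho_{\mathbf d}) \ge \frac{1}{2}S(\hat\rho_{\mathbf a}) + \frac{1}{2}S(\hat\rho_{\mathbf b}). \tag{5.48}$$

But, for $\eta = 1/2$, the Wigner characteristic functions of the two output states $\hat\rho_{\mathbf c}$ and $\hat\rho_{\mathbf d}$ are identical, i.e., $\chi_W^{\rho_c}(\zeta) = \chi_W^{\rho_d}(\zeta) = \chi_W^{\rho_a}(\zeta/\sqrt{2})\,\chi_W^{\rho_b}(\zeta/\sqrt{2})$, and hence the states $\hat\rho_{\mathbf c}$ and $\hat\rho_{\mathbf d}$ are identical. Therefore, $S(\hat\rho_{\mathbf c}) = S(\hat\rho_{\mathbf d})$. It follows that Eqs. (5.47) and (5.48) imply

$$S(\hat\rho_{\mathbf c}) \ge \frac{1}{2}S(\hat\rho_{\mathbf a}) + \frac{1}{2}S(\hat\rho_{\mathbf b}). \tag{5.49}$$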

5.5  Monotonicity  of  Quantum  Information 

The  following  result  is  a  straightforward  corollary  of  Theorem  5.3: 

Corollary 5.5: Let $\hat a_1$ and $\hat a_2$ be single-mode inputs to a 50-50 beam splitter, producing output mode $\hat b_2 = (\hat a_1 + \hat a_2)/\sqrt{2}$ in state $\hat\rho_{b_2}$. If $\hat a_1$ and $\hat a_2$ are in identical, statistically independent states $\hat\rho_a$, then $S(\hat\rho_{b_2}) \ge S(\hat\rho_a)$.

The classical version of Corollary 5.5 was proved by Shannon [2], who showed that if $Y_2 = (X_1 + X_2)/\sqrt{2}$ is a linear combination of two i.i.d. random variables with the same distribution as a random variable $X$, then $H(Y_2) \ge H(X)$. Shannon also proposed a general conjecture on the monotonicity of entropy, which was first proved only very recently [71].

Corollary 5.5 led us to propose yet another conjecture, on the monotonicity of von Neumann entropy, in analogy with Shannon's conjecture on the monotonicity of classical entropy. Our monotonicity conjecture has yet to be proved for the general case, even though we have been able to prove it for some special cases. In addition to the ABBN proof from [71], Shannon's monotonicity conjecture has also been proven by Tulino and Verdú [72] and by Madiman and Barron [73], each one using a different technique. In proving Shannon's monotonicity conjecture, Tulino and Verdú used the same result on the relationship between minimum mean-squared error (MMSE) and mutual information that Verdú and Guo used to prove the EPI [66]. Hence, this suggests there might be complementary proofs for the EPnI and the quantum version of Shannon's monotonicity conjecture (see Section 5.5.2 below).


5.5.1  Shannon’s  conjecture  on  the  monotonicity  of  entropy 

The following theorem is the original form of Shannon's monotonicity conjecture:

Theorem 5.6 [Entropy increases at every step: [71, 72, 73]]: Let $\{X_1, X_2, \ldots\}$ be i.i.d. random variables, and let $Y_n$ be the normalized running sum defined by

$$Y_n = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{n}}. \tag{5.50}$$

Then, $H(Y_{n+1}) \ge H(Y_n)$, $\forall n \in \{1, 2, \ldots\}$.

Theorem 5.6 was proved first by Artstein, Ball, Barthe, and Naor in 2004 [71] using relationships between Shannon entropy and Fisher information. Two other proofs ([72, 73]) followed a few years later.
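A small numerical experiment makes Theorem 5.6 concrete. The sketch below, which assumes NumPy is available, takes the $X_i$ to be uniform on $[-\sqrt3, \sqrt3]$ (unit variance, an arbitrary choice), builds the density of $X_1+\cdots+X_n$ by repeated discrete convolution on a grid, and converts its entropy to that of $Y_n$ through $h(Y_n) = h(X_1+\cdots+X_n) - \tfrac12\ln n$. The grid spacing and function name are likewise assumptions, and the results are accurate only up to discretization error.

```python
import math
import numpy as np

def entropies_of_normalized_sums(n_max=4, dx=1e-3):
    """Differential entropies h(Y_1), ..., h(Y_n_max) for i.i.d. unit-variance uniform X_i."""
    half_width = math.sqrt(3.0)                      # uniform on [-sqrt(3), sqrt(3)]
    n_points = int(round(2 * half_width / dx))
    p1 = np.full(n_points, 1.0 / (n_points * dx))    # density samples, sum(p1) * dx == 1
    p_sum = p1.copy()
    result = []
    for n in range(1, n_max + 1):
        if n > 1:
            p_sum = np.convolve(p_sum, p1) * dx      # density of X_1 + ... + X_n
        mask = p_sum > 0
        h_sum = -np.sum(p_sum[mask] * np.log(p_sum[mask])) * dx
        result.append(float(h_sum - 0.5 * math.log(n)))  # rescale by 1/sqrt(n)
    return result

h = entropies_of_normalized_sums()
h_gauss = 0.5 * math.log(2 * math.pi * math.e)       # entropy of the unit-variance Gaussian
print(h, h_gauss)
```

The computed entropies increase with $n$ and stay below the Gaussian value $\tfrac12\ln(2\pi e)$, consistent with monotone convergence in the central limit theorem.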


5.5.2  A  conjecture  on  the  monotonicity  of  quantum  entropy 


In analogy to Theorem 5.6, it is natural to conjecture the following generalization of Corollary 5.5:

Conjecture 5.7 [von Neumann entropy increases at every step: Guha, 2008]: Let $\{\hat a_1, \hat a_2, \ldots\}$ be independent modes in identical states $\hat\rho_{a_i} = \hat\rho_a$. Let us define

$$\hat b_n = \frac{\hat a_1 + \hat a_2 + \cdots + \hat a_n}{\sqrt{n}}. \tag{5.51}$$

Then, $S(\hat\rho_{b_{n+1}}) \ge S(\hat\rho_{b_n})$, $\forall n \in \{1, 2, \ldots\}$.

Even though we do not have a proof of the above conjecture, we have the following two pieces of evidence that support its validity.
Proof of the monotonicity conjecture for steps of powers of 2

The following theorem proves a slightly less general version of the conjecture. We will show that $S(\hat\rho_{b_{2^{k+1}}}) \ge S(\hat\rho_{b_{2^k}})$. Thus, von Neumann entropy does increase monotonically (at steps $n = 2^k$, $\forall k$) as we mix in more and more modes in identical independent states, but whether or not the entropy increases at every step $n$ is not yet known.

Theorem 5.8 [von Neumann entropy increases at powers-of-2 steps: Guha, 2008]: Let $\{\hat a_1, \hat a_2, \ldots\}$ be independent modes in identical states $\hat\rho_{a_i} = \hat\rho_a$. Let us define

$$\hat b_n = \frac{\hat a_1 + \hat a_2 + \cdots + \hat a_n}{\sqrt{n}}. \tag{5.52}$$

Then, $S(\hat\rho_{b_{2^{k+1}}}) \ge S(\hat\rho_{b_{2^k}})$, $\forall k \in \{0, 1, \ldots\}$.

Proof: Consider

$$\hat b_{2^{k+1}} = \frac{\hat a_1 + \cdots + \hat a_{2^{k+1}}}{\sqrt{2^{k+1}}} \tag{5.53}$$

$$= \frac{1}{\sqrt{2}}\left(\frac{\hat a_1 + \cdots + \hat a_{2^k}}{\sqrt{2^k}} + \frac{\hat a_{2^k+1} + \cdots + \hat a_{2^{k+1}}}{\sqrt{2^k}}\right) \tag{5.54}$$

$$= \frac{1}{\sqrt{2}}\left(\hat b_{2^k} + \hat b'_{2^k}\right), \quad \forall k \in \{0, 1, \ldots\}, \tag{5.55}$$

where we define $\hat b'_{2^k} \equiv (\hat a_{2^k+1} + \cdots + \hat a_{2^{k+1}})/\sqrt{2^k}$. As the $\hat a_i$'s are mutually independent and are in identical states $\hat\rho_a$, $\hat b_{2^k}$ and $\hat b'_{2^k}$ must be in independent identical states, $\hat\rho_{b_{2^k}}$. The proof now follows from applying Corollary 5.5 to the modes $\hat b_{2^k}$ and $\hat b'_{2^k}$ mixing on a 50-50 beam splitter to produce $\hat b_{2^{k+1}}$.


The  quantum  central  limit  theorem 

An important consequence of Shannon's monotonicity result (Theorem 5.6 above) is that the convergence in the central limit theorem is monotonic. The Central Limit Theorem (CLT) states that:

Theorem 5.9 [Central Limit Theorem (CLT)]: Let $\{X_1, X_2, \ldots\}$ be independent identically distributed copies of a zero-mean random variable $X$ with variance $\sigma_X^2$, and let $Y_n$ be the normalized running sum defined by

$$Y_n = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{n}}. \tag{5.56}$$

Then, $Y_n$ converges in distribution to a zero-mean Gaussian random variable $X_G$ with variance $\mathrm{Var}(X_G) = \sigma_X^2$, as $n \to \infty$. Hence, $\lim_{n\to\infty} H(Y_n) = H(X_G) = \tfrac{1}{2}\ln(2\pi e\sigma_X^2)$.

The monotonicity result (Theorem 5.6) proves that $H(Y_n)$ increases monotonically as $n$ increases, while the CLT (Theorem 5.9) says that $H(Y_n)$ converges as $n$ increases without bound, and converges to the entropy of the Gaussian random variable with the same variance as $X$.


In the quantum case, we have yet to prove our conjectured monotonicity result (Conjecture 5.7). However, we can prove that von Neumann entropy is monotonic in $n$ for $n \in \{1, 2, 4, \ldots, 2^k, \ldots\}$ (Theorem 5.8). We will show below that the von Neumann entropy $S(\hat\rho_{b_{2^k}})$ in Theorem 5.8 also converges as $n = 2^k$ increases without bound, like the Shannon entropy in the classical case, and converges to the von Neumann entropy of a single-mode zero-mean Gaussian state with the same second-order moments as the zero-mean single-mode state $\hat\rho_a$. To state it more precisely:


Theorem 5.10 [Quantum Central Limit Theorem (QCLT): Shapiro, 2008]: Let
$\{\hat{a}_1, \hat{a}_2, \ldots\}$ be independent modes in identical zero-mean states $\rho_{a_i} = \rho_a$. Let us
define

$$\hat{b}_n \equiv \frac{\hat{a}_1 + \hat{a}_2 + \cdots + \hat{a}_n}{\sqrt{n}}. \tag{5.57}$$


Then, the state $\rho_{b_n}$ converges to the single-mode zero-mean Gaussian state $\rho_G$ with
covariance matrix $K_{\rho_G} = K_a$ as $n \to \infty$. Hence, $\lim_{n\to\infty} S(\rho_{b_n}) = S(\rho_G) = g\!\left(2\sqrt{|K_a|} - 1/2\right)$.


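Proof: From the independence of the modes $\hat{a}_i$, $1 \le i \le n$, we have

$$\chi_W^{\rho_{b_n}}(\zeta) = \left[\chi_W^{\rho_a}\!\left(\frac{\zeta}{\sqrt{n}}\right)\right]^n. \tag{5.58}$$

Expressing the Wigner characteristic functions in terms of the real and imaginary
parts of $\zeta = \zeta_1 + j\zeta_2$, we have

$$\begin{aligned}
\ln \chi_W^{\rho_{b_n}}(\zeta) &= n \ln \chi_W^{\rho_a}\!\left(\frac{\zeta}{\sqrt{n}}\right) && (5.59) \\
&= n \ln \left\langle \exp\!\left(\frac{-2j\zeta_1 \hat{a}_2 + 2j\zeta_2 \hat{a}_1}{\sqrt{n}}\right) \right\rangle_{\rho_a}. && (5.60)
\end{aligned}$$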

Note that $\chi_W^{\rho_a}(0,0) = 1$ and that we are given $\langle \hat{a} \rangle = 0$. For a function $f(x,y)$, such
that $f(0,0) = 1$, we have the following Taylor series expansion for $\ln(f(x,y))$ around
$(x,y) = (0,0)$:

$$\begin{aligned}
\ln(f(x,y)) = {} & x f_x(0,0) + y f_y(0,0) + \frac{1}{2}\Big[ x^2\big(f_{xx}(0,0) - f_x(0,0)^2\big) \\
& + xy\big(f_{xy}(0,0) - 2 f_x(0,0) f_y(0,0) + f_{yx}(0,0)\big) + y^2\big(f_{yy}(0,0) - f_y(0,0)^2\big)\Big] \\
& + \text{h.o.t.}, \qquad (5.61)
\end{aligned}$$
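
The second-order expansion of $\ln f$ about the origin can be sanity-checked numerically. The sketch below is an illustration only: the quadratic test function `f` and its coefficients are hypothetical, chosen so that $f(0,0) = 1$ and all derivatives at the origin are known in closed form.

```python
import math

def log_taylor_2nd(x, y, fx, fy, fxx, fxy, fyy):
    """Second-order expansion of ln f(x, y) about (0, 0), assuming f(0, 0) = 1.
    All derivative arguments are evaluated at the origin, with f_xy = f_yx."""
    return (x * fx + y * fy
            + 0.5 * (x * x * (fxx - fx * fx)
                     + x * y * (2.0 * fxy - 2.0 * fx * fy)
                     + y * y * (fyy - fy * fy)))

# Hypothetical test function with f(0, 0) = 1.
a, b, c, d, e = 0.3, -0.7, 0.5, 1.1, -0.4

def f(x, y):
    return 1.0 + a * x + b * y + c * x * x + d * x * y + e * y * y

x, y = 1e-2, -2e-2
exact = math.log(f(x, y))
# At the origin: f_x = a, f_y = b, f_xx = 2c, f_xy = d, f_yy = 2e.
approx = log_taylor_2nd(x, y, fx=a, fy=b, fxx=2 * c, fxy=d, fyy=2 * e)
print(f"exact={exact:.8f}  second-order={approx:.8f}  diff={exact - approx:.2e}")
```

The residual is third order in the distance from the origin, which is why, after the substitution $\zeta \to \zeta/\sqrt{n}$, the neglected terms vanish as $n$ grows.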


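where h.o.t. denotes higher-order terms. Using this expansion, we expand
$\ln \chi_W^{\rho_{b_n}}(\zeta) = n \ln \chi_W^{\rho_a}(\zeta/\sqrt{n})$ around $(\zeta_1, \zeta_2) = (0,0)$
by evaluating all the first- and second-order partial derivatives of $\chi_W^{\rho_a}(\zeta_1, \zeta_2)$. We
obtain the following:

$$\ln \chi_W^{\rho_{b_n}}(\zeta) = n\left[ -\frac{2\left(\zeta_1^2 \langle \hat{a}_2^2 \rangle + \zeta_2^2 \langle \hat{a}_1^2 \rangle - 2\zeta_1\zeta_2 \langle \hat{a}_1 \hat{a}_2 \rangle\right)}{n} + o\!\left(\frac{1}{n^{3/2}}\right)\right], \tag{5.62}$$

where $\langle \hat{a}_1 \hat{a}_2 \rangle$ is understood as the symmetrized moment $\langle \hat{a}_1\hat{a}_2 + \hat{a}_2\hat{a}_1 \rangle/2$, which implies that

$$\chi_W^{\rho_{b_n}}(\zeta) = \exp\left[ -2\left(\zeta_1^2 \langle \hat{a}_2^2 \rangle + \zeta_2^2 \langle \hat{a}_1^2 \rangle - 2\zeta_1\zeta_2 \langle \hat{a}_1 \hat{a}_2 \rangle\right) + o\!\left(\frac{1}{n^{1/2}}\right)\right]. \tag{5.63}$$

Hence in the limit $n \to \infty$, $\chi_W^{\rho_{b_n}}(\zeta)$ is identical to the Wigner characteristic function of
a Gaussian state whose covariance matrix equals that of the state $\rho_a$ (see Appendix A).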


It can be shown that for a state $\rho_a$ with covariance matrix $K_a$, the von Neumann
entropy $S(\rho_a)$ is maximum when $\rho_a$ is Gaussian. Thus, the proof of the Monotonicity
Conjecture for $n = 2^k$ (Theorem 5.8) along with the Quantum Central Limit
Theorem (Theorem 5.10) suggests that the entropy $S(\rho_{b_n})$ increases monotonically as
$n$ increases, and converges to the entropy of the Gaussian state $\rho_G$ whose covariance
matrix is the same as that of $\rho_a$, i.e., $\lim_{n\to\infty} S(\rho_{b_n}) = g\!\left(2\sqrt{|K_a|} - 1/2\right)$.
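
A simple non-Gaussian input illustrates both the monotonicity for $n = 2^k$ and the convergence to the Gaussian entropy. The sketch below assumes, purely for illustration, that every input mode is in the single-photon number state $|1\rangle$. This state has zero mean and the same second moments as a thermal state of mean photon number 1, whose entropy is $g(1)$ with $g(x) = (1+x)\ln(1+x) - x\ln x$. Because the input has a definite total photon number and the mixing is passive, the reduced state of $\hat{b}_n$ is diagonal in the number basis, so its von Neumann entropy is the Shannon entropy of its photon-number distribution. The moment formula in the code is derived for this specific input and is not taken from the thesis.

```python
import math
from fractions import Fraction

def output_photon_distribution(n):
    """Photon-number distribution of mode b_n = (a_1+...+a_n)/sqrt(n) when every
    input mode a_i is in the single-photon number state |1>."""
    # Normally ordered moments <b^dag^k b^k> = (k!)^2 * C(n, k) / n^k for this input.
    moments = [Fraction(math.factorial(k) ** 2 * math.comb(n, k), n ** k)
               for k in range(n + 1)]
    # P(m) = sum_{k>=m} (-1)^(k-m) <b^dag^k b^k> / (m! (k-m)!), in exact arithmetic.
    return [sum(Fraction((-1) ** (k - m), math.factorial(m) * math.factorial(k - m))
                * moments[k] for k in range(m, n + 1))
            for m in range(n + 1)]

def entropy_nats(probs):
    """Shannon entropy (nats) of a probability list, skipping zero entries."""
    return sum(-float(p) * math.log(float(p)) for p in probs if p > 0)

def g(x):
    """g(x) = (1 + x) ln(1 + x) - x ln(x), with g(0) = 0."""
    return (1 + x) * math.log(1 + x) - x * math.log(x) if x > 0 else 0.0

entropies = {n: entropy_nats(output_photon_distribution(n)) for n in (1, 2, 4, 8, 16)}
for n, s in entropies.items():
    print(f"n={n:2d}: S = {s:.4f} nats")
print(f"Gaussian limit g(1) = {g(1.0):.4f} nats")
```

For $n = 2$ this reproduces the familiar two-photon interference result: the output mode holds zero or two photons with probability 1/2 each, so its entropy is $\ln 2$, up from zero at $n = 1$.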
Chapter 6


Conclusions  and  Future  Work 


In  this  chapter,  we  summarize  the  accomplishments  of  the  thesis,  and  make  sugges¬ 
tions  for  future  work. 


6.1  Summary 

Classical information theory was born with Claude Shannon's seminal 1948 paper [2],
in which he derived the ultimate limits to data rates at which reliable communications
can be achieved over a channel. It took almost half a century of painstaking research
to come up with error-correcting codes that actually operate near the
Shannon bound [74]. The past 40 years have also witnessed tremendous growth in
the complexity and power of digital computing, and with the advent of nanoscale
technologies, modern-day digital computing chips are coming close to the
physical limits imposed by quantum mechanics. The advent of Shor's factoring
algorithm [75] and some other quantum algorithms discovered in the past
decade has shown us that the interesting though somewhat counter-intuitive
implications of the quantum nature of matter can potentially be used to our advantage
in performing computing and communications tasks, and can solve efficiently some
problems that have no known efficient classical solutions.

The primary motivation behind this thesis derives from the overwhelming interest
in today's communications and information theory communities in pursuing the quantum
parallel of the half a century of work on information theory, error-control coding
and  the  theory  of  digital  communications  that  began  with  Shannon’s  work.  Quan¬ 
tum  information  science  has  seen  several  advances  in  the  past  decade,  and  we  already 
understand  fairly  well  the  information  theory  behind  sending  classical  data  reliably 
over  point-to-point  quantum  communication  channels,  i.e.,  encoding  classical  data  by 
modulating  the  quantum  states  of  carrier  particles  of  the  medium.  What  is  less  well 
understood  is  the  information  theory  behind  sending  classical  data  in  multiple-user 
settings,  over  point-to-point  quantum  channels  with  feedback,  over  fading  channels, 
over  channels  in  which  the  transmitter  and  receiver  have  multiple  antennas,  sending 
quantum  data  reliably  over  quantum  channels,  etc.  Peter  Shor  and  Seth  Lloyd  have 
shown  that  the  maximum  of  a  quantity  called  coherent  information  of  a  channel  is  the 
maximum  achievable  data  rate,  in  qubits  per  channel  use,  at  which  quantum  informa¬ 
tion  can  be  transmitted  reliably  over  a  quantum  channel  by  appropriately  encoding 
and  decoding  the  quantum  information  [76,  77]. 

The performance of communication systems that use electromagnetic waves to
carry information is ultimately limited by noise of quantum-mechanical origin.
At optical frequencies the quantum-mechanical effects are fairly pronounced and
perceivable,  and  shot-noise-limited  semiclassical  photo-detection  theory  falls  short  of 
explaining  the  measurement  statistics  obtained  by  standard  optical  receivers  detect¬ 
ing  non-classical  states  of  light.  Thus,  determining  the  ultimate  classical  information 
carrying  capacity  of  optical  communication  channels  requires  quantum-mechanical 
analysis  to  properly  account  for  the  bosonic  nature  of  optical  waves.  Recent  research 
by  several  theorists  in  our  group  and  by  several  others,  has  established  capacity 
theorems  for  point-to-point  bosonic  channels  with  additive  thermal  noise,  under  the 
presumption  of  a  minimum  output  entropy  conjecture  for  such  channels  [55].  Towards 
the beginning of this thesis, we drew upon our work on the capacity of the
point-to-point lossy bosonic channel to evaluate the optimum capacity of the free-space
line-of-sight optical communication channel with Gaussian-attenuation transmit and
receive apertures. Optimal power allocation across all the spatio-temporal modes was
studied, in the far-field and near-field propagation regimes. We also established
the gap between the ultimate capacity and the data rates that can be achieved by
using classical encoding states and structured receiver measurements.

The latter part of this thesis was an attempt to further the pursuit of the ultimate
classical  information  capacity  of  bosonic  channels,  albeit  in  the  multiple-user  setting; 
particularly  for  the  case  in  which  one  transmitter  sends  independent  streams  of  bits 
to  more  than  one  receiver,  viz.,  the  broadcast  channel.  We  drew  upon  recent  work 
on  the  capacity  region  of  two-user  degraded  quantum  broadcast  channels  to  establish 
ultimate  capacity-region  theorems  for  the  bosonic  broadcast  channel,  under  the  pre¬ 
sumption  of  another  conjecture  on  the  minimum  output  entropy  of  bosonic  channels. 
We  also  generalized  the  degraded  broadcast  channel  capacity  theorem  to  the  case  of 
more than two receivers, and we proved that if the above conjecture is true, the rate
region achievable using a coherent-state encoding with optimal joint-detection
measurement at the receivers would in fact be the ultimate capacity region of the bosonic
broadcast channel with additive thermal noise and loss, and with an arbitrary number
of receivers. In an attempt to prove the minimum output entropy conjectures, we
realized that these conjectures, restated for the Wehrl-entropy measure instead of von
Neumann entropy, could all be shown to be immediate consequences of the entropy
power inequality (EPI), a very well known inequality in classical information theory,
primarily used in proving coding-theorem converses for Gaussian channels. The
upshot  of  the  equivalence  established  between  the  EPI  and  the  Wehrl-entropy  con¬ 
jectures,  was  our  realization  that  an  EPI-like  inequality,  restated  in  terms  of  the  von 
Neumann  entropy  measure,  would  imply  all  the  minimum  output  entropy  conjectures 
that  lie  at  the  heart  of  several  capacity  results  for  bosonic  communication  channels. 
We therefore conjectured the entropy photon-number inequality (EPnI), in analogy
with the EPI, which connects the von Neumann entropies and mean photon numbers of
states of bosonic modes that linearly interact with one another. We showed that the
minimum output entropy conjectures can be derived as special cases of the EPnI. We
conjectured two forms of the EPnI that we proved to be equivalent to each other.
We also conjectured a third form of the EPnI in analogy with the EPI, which the
former two forms can be readily shown to imply, but we have not been able to show
the converse. We proved the EPnI under a product-Gaussian-state restriction, and
proved the third form of the EPnI for the special case in which the input states mix
in equal proportions (i.e., $\eta = 1/2$). This proof of the third form of the EPnI for $\eta = 1/2$
instigated an investigation into the monotonicity properties of information, which are,
in their classical form, very closely tied to the EPI. In analogy with an old conjecture
by Shannon on the monotonicity of the Shannon entropy of the sum of i.i.d. random
variables, we proposed a quantum version of the monotonicity conjecture. We proved the
conjecture, but only for the special case in which the number of independent modes
in the mixture increases as powers of 2, i.e., $n = 2^k$. We also proved a quantum
version of the central limit theorem, which along with the proof of the monotonicity
conjecture for $n = 2^k$ provides strong evidence in favor of the quantum version of the
monotonicity conjecture.

6.2  Future  work 

In  what  follows,  we  describe  some  of  the  primary  open  problems  in  line  with  the 
research  done  in  this  thesis. 

6.2.1  Bosonic  fading  channels 

In  realistic  unguided-propagation  scenarios,  transmission  loss  in  the  propagation 
medium  is  frequency-dependent,  time-varying  and  is  of  probabilistic  nature.  Our 
work  on  the  capacity  of  wideband  free-space  optical  channels  in  Chapter  2  takes  into 
consideration  only  diffraction-limited  propagation  and  additive  ambient  noise  from 
a  thermal  environment.  Atmospheric  optical  transmission  suffers  from  a  variety  of 
other propagation problems, many of which are time-varying and random, e.g., the
fading  that  arises  from  the  refractive-index  fluctuations  known  as  atmospheric  tur¬ 
bulence.  Drawing  on  our  work  on  the  lossy  bosonic  channel  with  fixed  transmission 
loss,  an  outage-capacity  model  can  be  set  up  for  the  slow-fading  bosonic  channel,  i.e., 
in  the  case  in  which  the  transmissivity  changes  slowly  over  time  in  comparison  to  the 
data rate. Contrary to the case of fixed transmission loss, there is no transmission
rate R for the fading channel for which the probability of error can be driven down
arbitrarily close to zero. So, in the strict sense, the capacity of the slow-fading
channel is zero. The $\epsilon$-outage capacity is the maximum rate at which one can transmit
data reliably over the channel on at least a $1 - \epsilon$ fraction of the total
number of large blocks of channel uses in which transmission is attempted. For the
fast-fading  case,  similar  to  the  classical  scenario,  it  is  not  unreasonable  to  suspect 
that  it  will  be  meaningful  to  assign  a  positive  capacity  to  the  channel  in  the  usual 
sense,  in  the  limit  that  codewords  have  a  block-length  that  is  much  longer  than  the 
coherence  time  of  the  fade.  The  way  one  would  find  the  fast-fading  capacity,  say,  for 
the  lossy  bosonic  channel  using  coherent-state  inputs  under  a  mean  photon  number 
constraint  of  N  photons  per  mode  at  the  input,  would  be  by  maximizing  the  Holevo 
quantity 

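$$C_{\text{fast-fade-coh}} = \max_{p(\alpha)} \chi\!\left(p(\alpha),\ \iint p_\eta(x)\, |\sqrt{x}\,\alpha\rangle\langle\sqrt{x}\,\alpha|\, dx\, d^2\alpha\right), \tag{6.1}$$

where $\chi(p(\alpha), \rho_\alpha) = S\!\left(\sum_\alpha p(\alpha)\rho_\alpha\right) - \sum_\alpha p(\alpha) S(\rho_\alpha)$ is the Holevo information for the
ensemble $\{p(\alpha), \rho_\alpha\}$, $S(\rho) = -\mathrm{Tr}(\rho \log \rho)$ is the von Neumann entropy of the quantum
state $\rho$, and $p_\eta(x)$ is the probability distribution of the fast-fading transmissivity
parameter $\eta$ of the channel. Even though the above is an achievable rate using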
coherent  (classical)  states,  for  a  realistic  fading  model  such  as  Rayleigh  or  Rician 
fading,  whether  or  not  there  would  be  any  capacity  advantage  by  using  non-classical 
states  for  encoding,  is  yet  to  be  answered. 
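
The $\epsilon$-outage definition above can be sketched in a few lines. This is a minimal illustration under stated assumptions: each slow-fading block is treated as a pure-loss bosonic channel with fixed transmissivity $\eta$ and capacity $g(\eta \bar{N})$ nats per mode, thermal noise is ignored, and the transmissivity samples are hypothetical numbers rather than draws from any realistic turbulence model. The function names are likewise illustrative.

```python
import math

def g(x):
    """g(x) = (1 + x) ln(1 + x) - x ln(x), in nats, with g(0) = 0."""
    return (1 + x) * math.log(1 + x) - x * math.log(x) if x > 0 else 0.0

def outage_capacity(eta_samples, n_bar, eps):
    """Largest rate (nats per mode) supported on at least a (1 - eps) fraction of
    the sampled fading blocks, assuming a block with transmissivity eta supports
    any rate up to g(eta * n_bar). Requires 0 <= eps < 1."""
    etas = sorted(eta_samples)
    allowed_failures = math.floor(eps * len(etas))  # blocks permitted to be in outage
    eta_eps = etas[allowed_failures]                # worst transmissivity still served
    return g(eta_eps * n_bar)

# Hypothetical transmissivity samples, one per slow-fading block.
samples = [0.9, 0.2, 0.6, 0.05, 0.7, 0.4, 0.8, 0.5]
rate = outage_capacity(samples, n_bar=1.0, eps=0.25)
served = sum(1 for eta in samples if g(eta * 1.0) >= rate) / len(samples)
print(f"0.25-outage capacity ~ {rate:.4f} nats per mode, served fraction {served:.2f}")
```

With these samples the two weakest blocks are declared in outage and the rate is set by the third-smallest transmissivity, so the remaining blocks can all be decoded reliably.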

6.2.2  The  bosonic  multiple-access  channel  (MAC)

It was shown by Yen and Shapiro in [11] that coherent states achieve the sum-rate
capacity for the bosonic MAC with two transmitters and one receiver. It was also
shown that at the two corners of the capacity region of the two-user MAC (i.e., when
the transmission rate for one of the two transmitters is zero), using non-classical
(squeezed) states yields substantial rate-benefit over using classical (coherent) states
for encoding. Finding the best achievable rate region for the bosonic MAC for two or
more users, and the best encoding states and measurement that would achieve it,
is still an open problem.

6.2.3  Multiple-input  multiple-output  (MIMO)  or  multiple- 
antenna  channels 

Under  the  presumption  of  a  minimum  output  entropy  conjecture,  we  found  in  this 
thesis  the  ultimate  capacity  region  for  the  bosonic  broadcast  channel  with  additive 
thermal  noise,  and  an  arbitrary  number  of  receivers.  The  degraded  nature  of  the 
bosonic  broadcast  channel  is  instrumental  in  finding  the  capacity  region,  using  ex¬ 
tensions  of  known  results  on  degraded  quantum  broadcast  channels  [52]  to  infinite 
dimensional  Hilbert  spaces.  Multiple  Input  Multiple  Output  (MIMO)  channels  are 
those  in  which  each  transmitter  and  receiver  may  have  more  than  one  antenna.  A 
MIMO  channel  can  be  a  point-to-point,  multiple-access,  or  a  broadcast  channel  based 
on  how  many  physical  transmitters  and  receivers  it  has.  The  famous  classical  exam¬ 
ple  of  a  degraded  broadcast  channel  is  the  Gaussian-noise  broadcast  channel,  whose 
capacity region was found by Bergmans [49]. The capacity region of the MIMO Gaussian
broadcast channel, however, was a long-standing open problem because of the
non-degraded nature of the MIMO Gaussian channel. Very recently, the capacity of
the MIMO additive-Gaussian-noise broadcast channel was found by Weingarten et
al. [78]. Finding the classical capacity region for the general bosonic MIMO broadcast
channel  remains  an  open  problem. 

6.2.4  The  Entropy  photon-number  inequality  (EPnI)  and  its
consequences

The  Entropy  Power  Inequality  (EPI)  from  classical  information  theory  is  widely  used 
in  coding  theorem  converse  proofs  for  Gaussian  channels.  By  analogy  with  the  EPI, 
we conjectured in this thesis a quantum version of the EPI, which we call the Entropy
Photon-number Inequality (EPnI). We showed that the three minimum output
entropy conjectures cited in Chapter 4 are simple corollaries of the EPnI. Hence, proving
the EPnI would immediately establish key results for the capacities of bosonic
communication channels, including (i) the classical capacity of the single-user lossy
bosonic channel with additive thermal noise, (ii) the classical capacity region of the
general multiple-receiver bosonic broadcast channel, and, thanks to recent work by
Graeme Smith on the privacy capacity of degradable channels [60], (iii) the privacy
capacity of the bosonic wiretap channel, and (iv) the ultimate quantum capacity of the
lossy bosonic channel.¹

Even though the EPnI, being a stronger conjecture, might be harder to prove
than the less powerful minimum output entropy conjectures, the huge literature on
various ways to prove the EPI may help in trying to prove the EPnI. For
example, proving the EPnI for integer-order Rényi entropy might be a good first
step, as the Rényi entropy is simpler to deal with analytically than the von Neumann
entropy.


6.3  Outlook  for  the  Future 

The  ultimate  aim  of  research  on  information  theory  for  bosonic  channels  is  to  char¬ 
acterize  completely  the  ultimate  rate-limits  of  communications  over  the  most  general 
quantum  network.  In  particular,  this  goal  entails  developing  a  complete  theory  of 
continuous-variable communications, error-correction and cryptography (for instance,
CV quantum key distribution) for transmission of information over quantum optical
channels, at rates approaching the ultimate information theoretic limits. Toward that
end we need to develop a theoretical framework with which we might be able to port
known robust block and convolutional qubit error-correcting codes (and design new
codes)  for  bosonic  channels  where  the  quantum  state  of  every  field  mode  lives  in  an 
infinite  dimensional  Hilbert  space,  as  opposed  to  qubit  spaces  for  which  the  theory 
of quantum error-correcting codes (QECC) has been built. In classical communications,
by sampling and quantizing band-limited signals, it is possible to use bit-error-correcting
block and convolutional codes on analog continuous-time channels, such as
the band-limited additive white Gaussian noise (AWGN) channel. Plots of symbol-error
probability versus channel signal-to-noise ratio (SNR) quantify the performance
of specific codes over a given channel, in terms of the distance from the theoretical
bound imposed by Shannon. For instance, state-of-the-art turbo codes [74] with
soft-input soft-output (SISO) iterative decoding are known to perform within 0.1 dB of
the Shannon bound at a probability of symbol error of $10^{-5}$. It would be nice to
be able to make a similar statement about the performance of, say, a quantum
convolutional code (QCC) over a lossy bosonic channel with additive thermal noise for
transmission of quantum information, e.g., "The fidelity of decoding a certain QCC
over a lossy thermal noise channel increases as a function of the channel SNR, and
is within 0.1 dB of the theoretical bound set by the quantum coherent information".
Continuous-variable quantum key distribution is a topic on which a great deal of work
has been done recently [79], but more work is still needed to find the best secret key
rates, and the optimal protocols to achieve those rates over bosonic channels. Some
work has been done by Gottesman, Kitaev, and Preskill [80] on encoding qubit states
into continuous-variable field modes.

¹ The ultimate quantum capacity of the lossy bosonic channel has been found by Wolf et al. by
a technique that doesn't make use of any unproven conjecture. Wolf's capacity result agrees with
ours and hence lends more evidence to the truth of the second minimum output entropy conjecture.

Quantum  information  processing  has  seen  a  huge  surge  of  interest  in  the  past 
decade,  largely  in  academia  but  increasingly  in  industry.  Whereas  making  a  quan¬ 
tum  computer  crack  a  128-bit  RSA  encryption  code  using  Shor’s  algorithm  is  still 
a  distant  dream,  obtaining  better  data  rates  over  lasercom  channels  for  terrestrial 
and  deep-space  applications  using  quantum  modulation  and  detection  schemes,  or 
obtaining  progressively  more  secure  communications  using  reliable  quantum  key  dis¬ 
tribution  (QKD)  systems  over  existing  optical  channels  with  novel  encoding  schemes 
and  quantum  measurement,  seem  a  lot  more  realizable  in  a  relatively  short  time 
frame. 
Appendix A


Preliminaries 


This  appendix  will  provide  a  brief  background  on  quantum  mechanics,  quantum  op¬ 
tics,  and  quantum  information  theory  that  will  be  useful  in  reading  this  thesis. 

A.1  Quantum  mechanics:  states,  evolution,  and
measurement 

It  was  found  in  the  early  1900s  by  Max  Planck  that  the  energy  of  electromagnetic 
waves  must  be  described  as  consisting  of  small  packets  of  energy  or  ‘quanta’  in  order 
to  explain  the  spectrum  of  black-body  radiation.  He  postulated  that  a  radiating  body 
consisted  of  an  enormous  number  of  elementary  electronic  oscillators,  some  vibrating 
at  one  frequency  and  some  at  another,  with  all  frequencies  from  zero  to  infinity  being 
represented. The energy $E$ of any one oscillator was not permitted to take on any
arbitrary value, but was proportional to some integral multiple of the frequency $f$ of
the oscillator, i.e., $E = hf$, where $h = 6.626 \times 10^{-34}$ joule-seconds is Planck's
constant. In 1905, Albert Einstein used Planck's constant to explain the photoelectric
effect by postulating that the energy in a beam of light occurs in concentrations that
he called light quanta, which later came to be known as photons. This led to a
theory that established a duality between subatomic particles and electromagnetic
waves in which particles and waves were neither one nor the other, but had certain
properties of both.

The  foundations  of  quantum  mechanics  date  from  the  early  1800s,  but  the  real 
beginnings  of  modern  quantum  mechanics  date  from  the  work  of  Max  Planck  in 
the  1900s.  The  term  “quantum  mechanics”  was  first  coined  by  Max  Born  in  1924. 
The  acceptance  of  quantum  mechanics  by  the  general  physics  community  is  due  to 
its  accurate  prediction  of  the  physical  behavior  of  systems,  particularly  of  systems 
showing  previously  unexplained  phenomena  in  which  Newtonian  mechanics  fails,  such 
as black-body radiation, the photoelectric effect, and stable electron orbits. Most
of  classical  physics  is  now  recognized  to  be  composed  of  special  cases  of  quantum 
mechanics  and/or  relativity  theory.  Paul  Dirac  brought  relativity  theory  to  bear  on 
quantum  physics,  so  that  it  could  properly  deal  with  events  that  occur  at  a  substantial 
fraction  of  the  speed  of  light.  Classical  physics,  however,  also  deals  with  gravitational 
forces,  and  no  one  has  yet  been  able  to  bring  gravity  into  a  unified  theory  with  the 
relativized  quantum  theory. 

We provide below a very brief account of the mathematical formulation of
quantum mechanics, which will be a useful foundation for the material covered in this
thesis. For a detailed study of quantum mechanics, the reader is referred to one of the
many popular texts on the subject, such as [81] and [82].

A.1.1  Pure  and  mixed  states

A pure state in quantum mechanics is the entirety of information that may be known
about a physical system. Mathematically, a pure state is a unit-length vector $|\psi\rangle$
(known as a 'ket' in Dirac notation) that lives in a complex Hilbert space $\mathcal{H}$ of
possible states for that system. Expressed in terms of a complete set of basis vectors
$\{|\phi_n\rangle\} \in \mathcal{H}$, $|\psi\rangle = \sum_n c_n |\phi_n\rangle$ becomes a column vector of a (possibly infinite) set
of complex numbers $c_n$, where $\sum_n |c_n|^2 = 1$. With each pure state $|\psi\rangle$ we associate
its Hermitian conjugate vector (known as a 'bra') $\langle\psi|$, which is a row vector when
expressed in a basis of $\mathcal{H}$. The simplest example of a pure state is the state of a
two-level system, also known as a 'qubit', which is the fundamental unit of quantum
information, in analogy with a 'bit' of classical information. A qubit lives in the
two-dimensional complex vector space $\mathbb{C}^2$ spanned by two orthonormal vectors $|0\rangle$ and
$|1\rangle$, and can be expressed as $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, where $\alpha, \beta \in \mathbb{C}$, and $|\alpha|^2 + |\beta|^2 = 1$.

A mixed state in quantum mechanics represents classical (statistical) uncertainty
about a physical system. Mathematically, a mixed state is represented by a 'density
matrix' (or a density operator) $\rho$, which is a positive semidefinite, unit-trace operator in
$\mathcal{H}$. The canonical form of a density matrix is

$$\rho = \sum_k p_k |\psi_k\rangle\langle\psi_k|, \tag{A.1}$$

for any collection of pure states $\{|\psi_k\rangle\}$, with $p_k \ge 0$ and $\sum_k p_k = 1$. The mixed state $\rho$ can be
thought of as a statistical mixture of pure states $|\psi_k\rangle$, where the projection $|\psi_k\rangle\langle\psi_k|$
is the density operator for the pure state $|\psi_k\rangle$, though it is worth pointing out that
the decomposition of a mixed state $\rho$ as a mixture of pure states (A.1) is by no means
unique. As we know, a positive semidefinite operator $\rho$ must have a spectral decomposition
$\rho = \sum_i \lambda_i |\lambda_i\rangle\langle\lambda_i|$, in terms of the eigenkets $|\lambda_i\rangle$, with the unit-trace condition on $\rho$
requiring that the eigenvalues $\lambda_i$ must form a probability distribution.
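
The non-uniqueness of the decomposition (A.1) is easy to see for a single qubit. The sketch below, with hypothetical helper names, builds the same density matrix from two different ensembles: an equal mixture of $|0\rangle$ and $|1\rangle$, and an equal mixture of $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$.

```python
import math

def outer(psi):
    """|psi><psi| as a nested list of complex numbers."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def mixture(weights, states):
    """rho = sum_k p_k |psi_k><psi_k| for weights p_k and unit vectors psi_k."""
    dim = len(states[0])
    rho = [[0j] * dim for _ in range(dim)]
    for p, psi in zip(weights, states):
        proj = outer(psi)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += p * proj[i][j]
    return rho

s = 1 / math.sqrt(2)
ket0, ket1 = [1 + 0j, 0j], [0j, 1 + 0j]
ket_plus, ket_minus = [s + 0j, s + 0j], [s + 0j, -s + 0j]

rho_z = mixture([0.5, 0.5], [ket0, ket1])            # equal mixture of |0>, |1>
rho_x = mixture([0.5, 0.5], [ket_plus, ket_minus])   # equal mixture of |+>, |->

trace = sum(rho_z[i][i] for i in range(2)).real
purity = sum(rho_z[i][j] * rho_z[j][i] for i in range(2) for j in range(2)).real
print("rho_z =", rho_z)
print("rho_x =", rho_x)
print("trace =", trace, " purity Tr(rho^2) =", purity)
```

Both ensembles give the maximally mixed qubit state, with unit trace and purity 1/2, so no measurement on the system can distinguish them.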


A.1.2  Composite  quantum  systems

We shall henceforth use symbols such as $A$, $B$, $C$ to refer to quantum systems, with $\mathcal{H}_A$
referring to the Hilbert space whose unit vectors are the pure states of the quantum
system $A$. Given two systems $A$ and $B$, the pure states of the composite system
$AB$ correspond to unit vectors in $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$. We use superscripts on pure
state vectors and density matrices to identify the quantum system with which they
are associated. For a multipartite density matrix $\rho^{ABC}$, we use the notation $\rho^{AB} =
\mathrm{Tr}_C\, \rho^{ABC}$ to denote the partial trace over one of the constituent quantum systems.

Let $\{|\phi_m\rangle^A\}$ and $\{|\phi_n\rangle^B\}$ represent sets of basis vectors for the state spaces $\mathcal{H}_A$
and $\mathcal{H}_B$ of quantum systems $A$ and $B$ respectively. Pure states $|\psi\rangle^{AB}$ and mixed states
$\rho^{AB}$ of the composite system $AB$ are defined similarly as above with an underlying
set of basis vectors $|\phi_{mn}\rangle^{AB} = |\phi_m\rangle^A \otimes |\phi_n\rangle^B \in \mathcal{H}_{AB}$, viz.,

$$|\psi\rangle^{AB} = \sum_{mn} c_{mn} |\phi_{mn}\rangle^{AB}, \quad \text{with } \sum_{mn} |c_{mn}|^2 = 1, \text{ and} \tag{A.2}$$

$$\rho^{AB} = \sum_k p_k\, |\psi_k\rangle^{AB}\, {}^{AB}\langle\psi_k|, \quad \text{with } p_k \ge 0,\ \sum_k p_k = 1, \tag{A.3}$$

for pure states $|\psi_k\rangle^{AB} \in \mathcal{H}_{AB}$.

A pure state $|\psi\rangle^{AB} \in \mathcal{H}_{AB}$ of a composite system $AB$ can be classified into:

1. A product state: when $|\psi\rangle^{AB}$ can be decomposed into a tensor product of two
pure states in $A$ and $B$, i.e. $|\psi\rangle^{AB} = |\psi\rangle^A \otimes |\psi\rangle^B$.

2. An entangled state: when $|\psi\rangle^{AB}$ cannot be expressed as a tensor product of
two pure states in $A$ and $B$ (for instance, the state $(|0\rangle|0\rangle + |1\rangle|1\rangle)/\sqrt{2}$ is a pure
entangled state of a two-qubit system).¹
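
For pure states, the product/entangled distinction can be checked from the coefficient matrix $c_{mn}$ of (A.2): the reduced state is $\rho^A = C C^\dagger$, and it is pure (purity $\mathrm{Tr}[(\rho^A)^2] = 1$) exactly when the joint state is a product state. The sketch below is a minimal illustration with hypothetical function names, applied to one product state and to the Bell state quoted above.

```python
import math

def reduced_state_A(c):
    """rho^A = Tr_B |psi><psi| for |psi> = sum_mn c[m][n] |m>|n>, i.e. C C^dagger."""
    dim_a, dim_b = len(c), len(c[0])
    return [[sum(c[m][n] * c[k][n].conjugate() for n in range(dim_b))
             for k in range(dim_a)] for m in range(dim_a)]

def purity(rho):
    """Tr(rho^2); equals 1 exactly when rho is a pure state."""
    d = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(d) for j in range(d)).real

s = 1 / math.sqrt(2)
product = [[s + 0j, s + 0j], [0j, 0j]]   # |0> (x) (|0> + |1>)/sqrt(2)
bell = [[s + 0j, 0j], [0j, s + 0j]]      # (|0>|0> + |1>|1>)/sqrt(2)

purities = {name: purity(reduced_state_A(coeffs))
            for name, coeffs in (("product", product), ("Bell", bell))}
for name, value in purities.items():
    print(f"{name}: purity of rho^A = {value:.6f}")
```

The product state leaves system $A$ in a pure state, whereas the Bell state leaves it maximally mixed.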

A mixed state $\rho^{AB} \in \mathcal{B}(\mathcal{H}_{AB})$ of a composite system² $AB$ can be classified into:

1. A product state: when $\rho^{AB}$ can be decomposed into a tensor product of two
states in $A$ and $B$, i.e. $\rho^{AB} = \rho^A \otimes \rho^B$, with at least one of $\rho^A$ or $\rho^B$ being a
mixed state.

2. A classically-correlated state: when $\rho^{AB}$ is not a product state, but can be
expressed nevertheless as a statistical mixture of product pure states of the
systems $A$ and $B$, i.e. $\rho^{AB} = \sum_k p_k \left(|\alpha_k\rangle^A \otimes |\beta_k\rangle^B\right)\left({}^A\langle\alpha_k| \otimes {}^B\langle\beta_k|\right)$, for any set
of pure states $|\alpha_k\rangle \in \mathcal{H}_A$ and $|\beta_k\rangle \in \mathcal{H}_B$, with $p_k \ge 0$ and $\sum_k p_k = 1$.

3.  An  entangled  state  —  when  pAB  is  a  mixed  state  of  the  composite  system  AB 
which  is  neither  a  product  state  nor  a  classically-correlated  state,  i.e.  the  joint 
state  of  the  composite  system  has  a  correlation  between  the  systems  A  and  B 

1  Entanglement  is  inherently  a  quantum-mechanical  property  of  composite  physical  systems  and 
is  stronger  than  any  probabilistic  correlation  between  the  constituent  systems  that  classical  physics 
might  permit.  The  individual  states  of  the  systems  A  and  B ,  when  their  joint  state  is  pure  and 
entangled,  are  mixed  states,  which  are  obtained  by  taking  a  partial  trace  over  the  other  system,  i.e. 
pA  =  Tr B(pAB)  =  TrB(\ip)ABAB{ip\)  =  B (fn IA4B|bn)'B>  and  vice  versa. 

2B{Ti)  is  the  set  of  all  bounded  operators  in  H. 
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which  is  stronger  than  any  (classical)  probabilistic  correlation.  For  instance, 
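consider an unequal mixture of the Bell states $|\alpha\rangle = (|0\rangle|0\rangle + |1\rangle|1\rangle)/\sqrt{2}$ and $|\beta\rangle =
(|1\rangle|0\rangle + |0\rangle|1\rangle)/\sqrt{2}$, with weights 3/4 and 1/4. This is a mixed entangled state,
$(3|\alpha\rangle\langle\alpha| + |\beta\rangle\langle\beta|)/4$, of a two-qubit system; the weights must be unequal, because the
equal mixture $(|\alpha\rangle\langle\alpha| + |\beta\rangle\langle\beta|)/2$ is separable.3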
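Whether a given two-qubit mixture is entangled can be checked numerically. The sketch below is not part of the original development: it applies the Peres-Horodecki (partial-transpose) test, under which a two-qubit state is entangled exactly when its partial transpose has a negative eigenvalue. The function names `bell_mixture` and `min_partial_transpose_eig` are illustrative assumptions.

```python
import numpy as np

def bell_mixture(p):
    """Return p|a><a| + (1-p)|b><b| for the two Bell states named in the text."""
    a = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)  # (|0>|0> + |1>|1>)/sqrt(2)
    b = np.array([0.0, 1.0, 1.0, 0.0]) / np.sqrt(2)  # (|1>|0> + |0>|1>)/sqrt(2)
    return p * np.outer(a, a) + (1 - p) * np.outer(b, b)

def min_partial_transpose_eig(rho):
    """Smallest eigenvalue of the partial transpose of a two-qubit state over qubit B."""
    r = rho.reshape(2, 2, 2, 2)                    # indices (a, b, a', b')
    r_pt = r.transpose(0, 3, 2, 1).reshape(4, 4)   # exchange b and b'
    return float(np.linalg.eigvalsh(r_pt).min())

unequal = min_partial_transpose_eig(bell_mixture(0.75))  # negative: entangled
equal = min_partial_transpose_eig(bell_mixture(0.5))     # not negative: separable
```

For weights 3/4 and 1/4 the smallest eigenvalue is -1/4, so that mixture is entangled; for equal weights it is 0, so the equal mixture passes the test and is separable.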


A.1.3 Evolution

The time evolution of a closed system is defined in terms of the unitary time-evolution
operator $\hat U(t,t_0) = \exp\left(-i\hat H(t - t_0)/\hbar\right)$, where $\hat H$ is the time-independent
Hamiltonian of the closed system. The evolution of the system when it is in a pure
state $|\psi(t_0)\rangle$ at time $t_0$, and when it is in a mixed state $\hat\rho(t_0)$ at time $t_0$, are respectively
given by:

$$|\psi(t)\rangle = \hat U(t,t_0)|\psi(t_0)\rangle, \text{ and} \tag{A.4}$$

$$\hat\rho(t) = \hat U(t,t_0)\,\hat\rho(t_0)\,\hat U^\dagger(t,t_0). \tag{A.5}$$
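As a small numerical illustration of Eqs. (A.4) and (A.5), the sketch below builds the evolution operator from the eigendecomposition of a Hermitian Hamiltonian. The qubit Hamiltonian (a Pauli-X matrix), the choice of units with $\hbar = 1$, and the function name `evolution_operator` are assumptions made for the example only.

```python
import numpy as np

HBAR = 1.0  # assumption: natural units

def evolution_operator(H, dt, hbar=HBAR):
    """U = exp(-i H dt / hbar), computed from the eigendecomposition of Hermitian H."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(-1j * evals * dt / hbar)) @ evecs.conj().T

H = np.array([[0.0, 1.0], [1.0, 0.0]])        # hypothetical qubit Hamiltonian (Pauli X)
U = evolution_operator(H, dt=np.pi / 2)       # equals -i X for this choice of dt

psi0 = np.array([1.0, 0.0], dtype=complex)
psi_t = U @ psi0                              # pure-state evolution, Eq. (A.4)

rho0 = np.diag([0.75, 0.25]).astype(complex)
rho_t = U @ rho0 @ U.conj().T                 # mixed-state evolution, Eq. (A.5)
```

Because the evolution is unitary, the purity $\mathrm{Tr}(\hat\rho^2)$ of the mixed state is unchanged; here the populations of the two basis states are simply exchanged.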


The time evolution of a general open system, i.e., a system that interacts with
an environment, is not unitary in general. The system and the environment together
form a closed system, whose joint state must follow a unitary evolution as
stated above. But when we look at the evolution of the state of the system alone, it is
non-unitary and is represented by what we call a trace-preserving, completely-positive
(TPCP) map. All quantum channels that we study in this thesis are TPCP maps.
A TPCP map $\mathcal{E}$ takes a density operator $\hat\rho_{\rm in} \in \mathcal{B}(\mathcal{H}_{\rm in})$ to a density operator
$\hat\rho_{\rm out} \in \mathcal{B}(\mathcal{H}_{\rm out})$, and must satisfy the following properties:


(i) $\mathcal{E}$ preserves the trace, i.e., $\mathrm{Tr}(\mathcal{E}(\hat\rho)) = 1$ for any density operator $\hat\rho \in \mathcal{B}(\mathcal{H}_{\rm in})$.

3 We reiterate that if a mixed state $\hat\rho^{AB}$ is not decomposable into a tensor product of mixed
states, i.e., $\hat\rho^{AB} \ne \hat\rho^A \otimes \hat\rho^B$, the joint state $\hat\rho^{AB}$ is NOT necessarily entangled, and it could just
have classical correlations between the two constituent systems. There has been a long ongoing
debate about whether the experimentally demonstrated enhancement in imaging characteristics of
optical coherence tomography (OCT) systems using the entangled bi-photon state generated by
spontaneous parametric downconversion (SPDC) should really be attributed to the entanglement
property of the photon pairs. It has been shown that almost all performance enhancements obtained
by using Gaussian entangled bi-photon imagers over thermal-light sources are also obtainable by
using classically-correlated Gaussian states with phase-sensitive correlations. See [69] for details.
(ii) $\mathcal{E}$ is a convex linear map on the set of density operators $\hat\rho \in \mathcal{B}(\mathcal{H}_{\rm in})$, i.e.,
$\mathcal{E}\left(\sum_k p_k\hat\rho_k\right) = \sum_k p_k\,\mathcal{E}(\hat\rho_k)$, for any probability distribution $\{p_k\}$.

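(iii) $\mathcal{E}$ is a completely positive map. This means that $\mathcal{E}$ maps positive operators in
$\mathcal{B}(\mathcal{H}_{\rm in})$ to positive operators in $\mathcal{B}(\mathcal{H}_{\rm out})$, and, for any reference system R and
for any positive operator $\hat\rho \in \mathcal{B}(\mathcal{H}_{\rm in} \otimes R)$, we have that $(\mathcal{E} \otimes \mathcal{I}_R)(\hat\rho) \ge 0$, where $\mathcal{I}_R$
is the identity map on R.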

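It can be shown that any TPCP map can be expressed in an operator-sum representation
[6], $\mathcal{E}(\hat\rho) = \sum_k \hat A_k\,\hat\rho\,\hat A_k^\dagger$, where the Kraus operators $\hat A_k$ must satisfy $\sum_k \hat A_k^\dagger\hat A_k = \hat I$
in order to preserve the trace of $\mathcal{E}(\hat\rho)$.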
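The operator-sum representation can be made concrete with a small example. The sketch below uses the qubit amplitude-damping channel, which is chosen here only for illustration and is not discussed in the text; it checks the completeness relation, trace preservation, and complete positivity. The last check uses the Choi matrix (positive semi-definite exactly when the map is completely positive), a standard tool that is likewise an addition for this example. All function names are hypothetical.

```python
import numpy as np

def amplitude_damping_kraus(gamma):
    """Kraus operators of a qubit amplitude-damping channel with decay probability gamma."""
    A0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
    A1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
    return [A0, A1]

def apply_channel(kraus, rho):
    """Operator-sum representation: E(rho) = sum_k A_k rho A_k^dagger."""
    return sum(A @ rho @ A.conj().T for A in kraus)

def choi_matrix(kraus, dim):
    """Choi matrix sum_ij |i><j| (x) E(|i><j|) of the channel."""
    choi = np.zeros((dim * dim, dim * dim), dtype=complex)
    for i in range(dim):
        for j in range(dim):
            e_ij = np.zeros((dim, dim), dtype=complex)
            e_ij[i, j] = 1.0
            choi += np.kron(e_ij, apply_channel(kraus, e_ij))
    return choi

kraus = amplitude_damping_kraus(0.3)
completeness = sum(A.conj().T @ A for A in kraus)    # should equal the identity
rho_in = 0.5 * np.ones((2, 2), dtype=complex)        # the pure state |+><+|
rho_out = apply_channel(kraus, rho_in)
min_choi_eig = float(np.linalg.eigvalsh(choi_matrix(kraus, 2)).min())
```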

A.1.4 Observables and measurement

In  quantum  mechanics,  each  dynamical  observable  (for  instance  position,  momentum, 
energy,  angular  momentum,  etc.)  is  represented  by  a  Hermitian  operator  M.  Being  a 
Hermitian operator, $\hat M$ must have a complete orthonormal set of eigenvectors $\{|\phi_m\rangle\}$
with associated real eigenvalues $\phi_m$ that satisfy $\hat M|\phi_m\rangle = \phi_m|\phi_m\rangle$. The outcome of
a measurement of $\hat M$ on a quantum state $\hat\rho$ is always an eigenvalue $\phi_n$, obtained with
probability $p(n) = \langle\phi_n|\hat\rho|\phi_n\rangle$. Given that the measurement result obtained is $\phi_n$,
the post-measurement state of the system is the eigenstate $|\phi_n\rangle$ corresponding to the
eigenvalue $\phi_n$. This phenomenon is known as the “collapse” of the wave function.
Thus,  if  the  system  is  in  an  eigenstate  of  a  measurement  operator  M  to  begin  with, 
the  measurement  result  is  known  with  certainty  and  the  measurement  of  M  doesn’t 
alter  the  state  of  the  system.  The  Hermitian  operator  H  corresponding  to  measuring 
the  total  energy  of  a  closed  quantum  system  is  known  as  the  Hamiltonian  for  the 
system.  The  measurement  of  an  observable  as  described  above  is  also  known  as  a 
projective  measurement,  as  the  measurement  projects  the  state  onto  an  eigenspace  of 
the  measurement  operator. 

In  analogy  to  the  evolution  of  an  open  system  described  above,  a  more  general 
measurement  on  a  system  entails  a  projective  measurement  performed  on  the  joint 
state  of  the  system  in  question  along  with  an  auxiliary  environment  prepared  in  some 
initial state. This general measurement scheme can be described by a set of positive
semi-definite operators $\{\hat\Pi_m\}$ that satisfy $\sum_m \hat\Pi_m = \hat I$. If a measurement is performed
on a quantum state $\hat\rho$, the outcome of the measurement is $n$ with probability
$p(n) = \mathrm{Tr}(\hat\rho\,\hat\Pi_n)$. The above description of a quantum measurement is known as the
positive operator-valued measure (POVM) formalism and the operators $\{\hat\Pi_m\}$ are
known as POVM operators. The POVM operators by themselves do not determine
a post-measurement state. We use the POVM formalism throughout the thesis.
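To illustrate the POVM formalism, the sketch below constructs a three-outcome qubit POVM (the so-called trine measurement, picked here only as a convenient example that is not a projective measurement) and evaluates $p(n) = \mathrm{Tr}(\hat\rho\,\hat\Pi_n)$. The function names are assumptions.

```python
import numpy as np

def trine_povm():
    """Three operators Pi_m = (2/3)|v_m><v_m| whose Bloch vectors are 120 degrees apart."""
    ops = []
    for m in range(3):
        bloch_angle = 2 * np.pi * m / 3
        v = np.array([np.cos(bloch_angle / 2), np.sin(bloch_angle / 2)])
        ops.append((2.0 / 3.0) * np.outer(v, v))
    return ops

def outcome_probabilities(rho, povm):
    """p(n) = Tr(rho Pi_n) for each POVM operator."""
    return np.array([np.trace(rho @ Pi).real for Pi in povm])

povm = trine_povm()
resolution = sum(povm)                       # should equal the 2x2 identity
rho = np.array([[1.0, 0.0], [0.0, 0.0]])     # the pure state |0><0|
probs = outcome_probabilities(rho, povm)     # approximately [2/3, 1/6, 1/6]
```

The three operators are positive semi-definite and sum to the identity, yet none of them is a projector, which is why they do not by themselves fix a post-measurement state.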

A.2 Quantum entropy and information measures

Amongst  various  measures  of  how  mixed  a  quantum  state  p  is,  the  information- 
theoretically  most  relevant  one  is  the  von  Neumann  entropy  S(p),  which  is  defined 
as 


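$$S(\hat\rho) = -\mathrm{Tr}(\hat\rho\ln\hat\rho) \tag{A.6}$$

$$= H(\{\lambda_n\}), \tag{A.7}$$

where $H(\{\lambda_n\}) = -\sum_n \lambda_n\ln\lambda_n$ is the Shannon entropy of the eigenvalues $\lambda_n$ of
$\hat\rho$. Hence, the von Neumann entropy of a pure state is zero, i.e.,
$S(|\psi\rangle\langle\psi|) = 0$, because a pure state has a single nonzero eigenvalue, equal to 1. Most of quantum information theory is built around the von Neumann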
entropy  measure  of  a  quantum  state.  Below,  we  list  a  few  important  properties  of 
von  Neumann  entropy: 

A.2.1 Data Compression

In  analogy  with  the  role  that  Shannon  entropy  plays  in  classical  information  theory, 
it can be shown that $S(\hat\rho^A)$ is the optimal compression rate on the quantum system
A in the state $\hat\rho^A \in \mathcal{B}(\mathcal{H}_A)$. In other words, for large $n$, the density matrix $(\hat\rho^A)^{\otimes n}$
has nearly all of its support on a subspace of $\mathcal{H}_A^{\otimes n}$ (called the typical subspace) of
dimension $2^{nS(A)}$, with $S(A)$ measured in bits. We will henceforth use the notation $S(A)$ interchangeably with
$S(\hat\rho^A)$ to mean the von Neumann entropy of the system A (or the von Neumann entropy
of the state $\hat\rho^A$). If A is a classical random variable, we use the function $H(A)$ to
denote the Shannon entropy of A.

A.2.2 Subadditivity

The joint entropy $S(A,B)$ of a bipartite system AB is always upper bounded by the
sum of the entropies of the individual systems A and B, i.e.,

$$S(A,B) \le S(A) + S(B), \tag{A.8}$$

with equality when the joint state of AB is a product state, i.e., $\hat\rho^{AB} = \hat\rho^A \otimes \hat\rho^B$.
Another well-known inequality, known as the strong subadditivity of von Neumann
entropy, is given by

$$S(A,B,C) + S(B) \le S(A,B) + S(B,C), \tag{A.9}$$

with equality when the tripartite system ABC is in a product state, i.e., $\hat\rho^{ABC} =
\hat\rho^A \otimes \hat\rho^B \otimes \hat\rho^C$.

A.2.3 Joint and conditional entropy

The entropy of a bipartite system AB in a joint state $\hat\rho^{AB}$ is defined as $S(A,B) =
-\mathrm{Tr}(\hat\rho^{AB}\ln\hat\rho^{AB})$. Even though there is no direct definition of quantum conditional
entropy as in classical information theory, one may define a conditional entropy (in
analogy to its classical counterpart) as $S(A|B) = S(A,B) - S(B)$. The quantum conditional
entropy can be negative, contrary to its classical counterpart4. Furthermore,
conditioning can only reduce entropy, i.e., $S(A|B,C) \le S(A|B)$, and discarding a
quantum system can never increase quantum mutual information (see Section A.2.5),
i.e., $I(A;B) \le I(A;B,C)$.
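The negative conditional entropy of an entangled state is easy to verify numerically. The sketch below evaluates the joint, marginal, and conditional entropies of a two-qubit Bell state, in bits (i.e., Eq. (A.6) divided by $\ln 2$); the helper names `entropy_bits` and `reduced_states` are assumptions.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits, from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]             # drop zero eigenvalues (0 log 0 = 0)
    return float(-np.sum(evals * np.log2(evals)))

def reduced_states(rho_ab, d_a, d_b):
    """Partial traces of a bipartite state: returns (rho_A, rho_B)."""
    r = rho_ab.reshape(d_a, d_b, d_a, d_b)   # indices (a, b, a', b')
    rho_a = np.trace(r, axis1=1, axis2=3)    # trace out B
    rho_b = np.trace(r, axis1=0, axis2=2)    # trace out A
    return rho_a, rho_b

psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # Bell state (|00> + |11>)/sqrt(2)
rho_ab = np.outer(psi, psi)
rho_a, rho_b = reduced_states(rho_ab, 2, 2)

S_ab, S_a, S_b = entropy_bits(rho_ab), entropy_bits(rho_a), entropy_bits(rho_b)
S_a_given_b = S_ab - S_b     # conditional entropy S(A|B) = S(A,B) - S(B)
I_ab = S_a + S_b - S_ab      # quantum mutual information I(A;B)
```

For this state the joint entropy vanishes while each marginal carries one bit, so $S(A|B) = -1$ bit and $I(A;B) = 2$ bits, and the subadditivity bound (A.8) holds with room to spare.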

4 For the bipartite two-qubit Bell state $|\psi\rangle_{AB} = (|00\rangle + |11\rangle)/\sqrt{2}$, with entropies measured in bits, $S(A|B) = S(A,B) - S(B) =
0 - 1 = -1$. The joint state of the system AB is a pure state, hence $S(A,B) = 0$, whereas the state
of system B, $\hat\rho^B = \mathrm{Tr}_A(\hat\rho^{AB}) = (|0\rangle\langle 0| + |1\rangle\langle 1|)/2$, is a mixed state with entropy $S(B) = 1$ bit.




A.2.4 Classical-quantum states


We  define  here  the  notion  of  classical-quantum  states  and  classical-quantum  channels. 
To any classical set $\mathcal{X}$, we associate a Hilbert space $\mathcal{H}_X$ with orthonormal basis
$\{|x\rangle^X\}_{x\in\mathcal{X}}$, so that for any classical random variable $X$ which takes the values $x \in \mathcal{X}$
with probability $p(x)$, we may write a density matrix

$$\hat\rho^X = \sum_x p(x)\,|x\rangle\langle x|^X = \bigoplus_x p(x),$$

which is diagonal in that basis. An ensemble of quantum states $\{p(x), \hat\rho_x^B\}$ can be
associated, in a similar way, to a block-diagonal classical-quantum (cq) state for the
system XB:

$$\hat\rho^{XB} = \sum_x p(x)\,|x\rangle\langle x|^X \otimes \hat\rho_x^B = \bigoplus_x p(x)\,\hat\rho_x^B, \tag{A.10}$$

where $X$ is a classical random variable and B is a quantum system, with conditional
density matrices $\hat\rho_x^B$. The conditional entropy $S(B|X)$ is then

$$S(B|X) = \sum_x p(x)\,S(\hat\rho_x^B). \tag{A.11}$$


A.2.5 Quantum mutual information


The quantum mutual information $I(A;B)$ of a bipartite system AB is defined in
analogy to Shannon mutual information as:

$$I(A;B) = S(A) + S(B) - S(A,B) \tag{A.12}$$

$$= S(A) - S(A|B) \tag{A.13}$$

$$= S(B) - S(B|A). \tag{A.14}$$
A bipartite product mixed state $\hat\rho^A \otimes \hat\rho^B$ has zero quantum mutual information. The
quantum mutual information of a cq-state (A.10) is given by

$$I(X;B) = S(B) - S(B|X) \tag{A.15}$$

$$= S\left(\sum_x p(x)\,\hat\rho_x^B\right) - \sum_x p(x)\,S(\hat\rho_x^B) \tag{A.16}$$

$$\equiv \chi\left(p(x), \hat\rho_x^B\right), \tag{A.17}$$

where $\chi(p(x), \hat\rho_x^B)$ is defined as the Holevo information of the ensemble of states
$\{p(x), \hat\rho_x^B\}$. This equivalence between the input-output quantum mutual information
$I(X;B)$ of a cq-system and the Holevo information $\chi(p(x), \hat\rho_x^B)$ will be used
extensively in the thesis.
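The Holevo information of Eq. (A.16) can be evaluated directly from the eigenvalues of the ensemble states. The sketch below does so, in bits, for a hypothetical ensemble of two equiprobable non-orthogonal qubit states, $|0\rangle$ and $(|0\rangle + |1\rangle)/\sqrt{2}$; the ensemble and the function names are assumptions chosen for illustration.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def holevo_information(probs, states):
    """chi = S(sum_x p(x) rho_x) - sum_x p(x) S(rho_x), as in Eq. (A.16)."""
    avg = sum(p * rho for p, rho in zip(probs, states))
    return entropy_bits(avg) - sum(p * entropy_bits(rho) for p, rho in zip(probs, states))

ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
states = [np.outer(ket0, ket0), np.outer(ket_plus, ket_plus)]
chi = holevo_information([0.5, 0.5], states)   # roughly 0.60 bits
```

Since both ensemble states are pure, the second term of (A.16) vanishes and $\chi$ equals the entropy of the average state, which is strictly less than one bit because the two states overlap.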


A.2.6 The Holevo bound


Suppose Alice chooses a classical message index $x \in \mathcal{X}$ with probability $p(x)$ and
encodes $x$ by preparing a quantum state $\hat\rho_x^A$. She sends her state to Bob through a
channel $\mathcal{E}$ which then produces a state $\hat\rho_x^B = \mathcal{E}(\hat\rho_x^A)$ at Bob’s end, conditioned on the
classical index $x$. In order to obtain information about $x$, Bob measures his state $\hat\rho_x^B$
using a POVM $\{\hat\Pi_y\}$. The probability that the outcome of his POVM measurement
is $y$ given Alice sent $x$ is given by $p(y|x) = \mathrm{Tr}(\hat\rho_x^B\,\hat\Pi_y)$. Using $X$ and $Y$ to denote the
random variables of which $x$ and $y$ are instances, we know from Shannon information
theory that, when Bob uses the POVM $\{\hat\Pi_y\}$, the maximum rate at which Alice can
transmit information to Bob by a suitable encoding and decoding scheme is given by
the maximum of the mutual information $I(X;Y)$ over all input distributions $p(x)$.
Holevo, Schumacher and Westmoreland showed [27, 28, 29] that for a given prior $p(x)$
and POVM $\{\hat\Pi_y\}$, the single-use Holevo information is an upper bound on Shannon
mutual information,

$$I(X;Y) \le \chi\left(p(x), \hat\rho_x^B\right), \tag{A.18}$$
which is known as the Holevo bound. Maximizing over $p(x)$ on both sides, one gets

$$\max_{p(x)} I(X;Y) \le \max_{p(x)} \chi\left(p(x), \mathcal{E}(\hat\rho_x^A)\right). \tag{A.19}$$

As the right-hand side does not depend on the choice of the POVM elements $\{\hat\Pi_y\}$,
the inequality is preserved by a further maximization of the left-hand side over the
measurements,

$$\max_{p(x),\{\hat\Pi_y\}} I(X;Y) \le \max_{p(x)} \chi\left(p(x), \mathcal{E}(\hat\rho_x^A)\right), \text{ or} \tag{A.20}$$

$$C_{1,1}(\mathcal{E}) \le C_{1,\infty}(\mathcal{E}), \tag{A.21}$$

where $C_{1,1}(\mathcal{E})$ is the maximum value of the Shannon information $I(X;Y)$ optimized
over all possible symbol-by-symbol POVM measurements $\{\hat\Pi_y\}$. $C_{1,\infty}(\mathcal{E})$, on the other
hand, is the maximum value of the Shannon information $I(X;Y)$ optimized not only
over all possible symbol-by-symbol POVM measurements, but also over arbitrary
multiple-channel-use POVM measurements. As we will see below, $C_{1,\infty}(\mathcal{E})$ is the
capacity of the channel $\mathcal{E}$ for transmission of classical information if Alice is limited
to sending single-channel-use symbols $\hat\rho_x^A$ and Bob may choose any joint measurement
at the receiver.
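A numerical check of the Holevo bound (A.18) for one specific measurement may help fix ideas. The sketch below assumes a hypothetical ensemble of two equiprobable qubit states, $|0\rangle$ and $(|0\rangle + |1\rangle)/\sqrt{2}$, received without further disturbance, and a computational-basis projective measurement at Bob's end; it compares the resulting Shannon mutual information with the Holevo information, both in bits. All names are illustrative.

```python
import numpy as np

def shannon_bits(p):
    """Shannon entropy in bits of a (possibly multi-dimensional) probability array."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(probs, states, povm):
    """I(X;Y) from the joint distribution p(x, y) = p(x) Tr(rho_x Pi_y)."""
    joint = np.array([[px * np.trace(rho @ Pi).real for Pi in povm]
                      for px, rho in zip(probs, states)])
    return shannon_bits(joint.sum(axis=1)) + shannon_bits(joint.sum(axis=0)) - shannon_bits(joint)

def holevo_information(probs, states):
    """chi = S(average state) - average of S(rho_x), using eigenvalue entropies."""
    entropy = lambda rho: shannon_bits(np.linalg.eigvalsh(rho))
    avg = sum(px * rho for px, rho in zip(probs, states))
    return entropy(avg) - sum(px * entropy(rho) for px, rho in zip(probs, states))

ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
states = [np.outer(ket0, ket0), np.outer(ket_plus, ket_plus)]
probs = [0.5, 0.5]
povm = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]   # projectors onto |0> and |1>

I_xy = mutual_information(probs, states, povm)      # roughly 0.31 bits
chi = holevo_information(probs, states)             # roughly 0.60 bits
```

The measurement chosen here is not optimal, so the gap between the two numbers reflects both the bound itself and the suboptimality of the receiver.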


A.2.7 Ultimate classical communication capacity: The HSW theorem

The classical capacity of a quantum channel is established by random coding arguments
akin to those employed in classical information theory. A set of symbols $\{j\}$
is represented by a collection of input states $\{\hat\rho_j\}$ that are selected according to some
prior distribution $\{p_j\}$. The output states $\{\hat\rho'_j\}$ are obtained by applying the channel’s
TPCP map $\mathcal{E}(\cdot)$ to these input symbols. According to the HSW Theorem, the
capacity of this channel, in nats per use, is

$$C = \sup_n\left(C_{n,\infty}/n\right) = \sup_n\left\{\max_{\{p_j,\hat\rho_j\}}\left[\chi\left(p_j, \mathcal{E}^{\otimes n}(\hat\rho_j)\right)/n\right]\right\}, \tag{A.22}$$

where $C_{n,\infty}$ is the capacity achieved when coding is performed over n-channel-use
symbols and an arbitrary joint-detection measurement is used at the receiver. The supremum
over $n$ is necessitated by the fact that channel capacity may be superadditive,
viz., $C_{n,\infty} > nC_{1,\infty}$ is possible for quantum channels, whereas such is not the case for
classical channels. The HSW Theorem tells us that Holevo information plays the role
for classical information transmission over a quantum channel that Shannon’s mutual
information does for a classical channel.


Neither Eq. (A.17) nor Eq. (A.22) has any explicit dependence on the quantum
measurement used at the receiver, so that measurement optimization is implicit
within the HSW Theorem. To obtain the same capacity $C$ by maximizing a Shannon
mutual information we can introduce a positive-operator-valued measure (POVM)
[6], representing the multi-symbol quantum measurement (a joint measurement over
an entire codeword) performed at the receiver. For example, if single-use encoding
is performed with priors $\{p_j\}$, the probability of receiving a particular m-symbol
codeword, $\mathbf{k} = (k_1, k_2, \ldots, k_m)$, given that $\mathbf{j} = (j_1, j_2, \ldots, j_m)$ was sent is

$$\Pr(\mathbf{k}\,|\,\mathbf{j}) = \mathrm{Tr}\left[\hat\Pi_{\mathbf{k}}\left(\bigotimes_{l=1}^{m}\hat\rho'_{j_l}\right)\right], \tag{A.23}$$

where the POVM, $\{\hat\Pi_{\mathbf{k}}\}$, is a set of Hermitian operators on the Hilbert space of
output states for $m$ channel uses that resolve the identity. From $\{p_j, \Pr(\mathbf{k}\,|\,\mathbf{j})\}$ we
can then write down a Shannon mutual information for single-use encoding and m-symbol
codewords that must be maximized. Ultimately, by allowing for n-channel-use
symbols and optimizing over the priors, the signal states, and the POVM, we
would arrive at the capacity predicted by the HSW Theorem. Evidently, determining
capacity is easier via the HSW Theorem than it is via Shannon mutual information,
because one less optimization is required. However, finding a practical system that
can approach capacity will require that we pay attention to the receiver measurement.

A.3 Quantum optics

Classical electromagnetic (EM) waves in free space in the absence of free electrostatic
charge and current densities are governed by the following Maxwell’s equations5:

$$\nabla \times E(r,t) = -\mu_0\frac{\partial H(r,t)}{\partial t} \tag{A.24}$$

$$\nabla \cdot \epsilon_0 E(r,t) = 0 \tag{A.25}$$

$$\nabla \times H(r,t) = \epsilon_0\frac{\partial E(r,t)}{\partial t} \tag{A.26}$$

$$\nabla \cdot \mu_0 H(r,t) = 0, \tag{A.27}$$
where $E(r,t)$ and $H(r,t)$ are the electric and magnetic field intensity vectors in free
space as a function of the 3D spatial coordinates $r$ and time $t$. The permittivity ($\epsilon_0$)
and permeability ($\mu_0$) of free space are constants satisfying $\mu_0\epsilon_0 = c^{-2}$, where $c$ is the
speed of light in vacuum. General solutions to these equations can be obtained by
introducing a vector potential $A(r,t)$ defined by $E = -\partial A/\partial t$ and $H = (\nabla \times A)/\mu_0$.
By working in the Coulomb gauge ($\nabla \cdot A = 0$), it is straightforward to show that
$A(r,t)$ must satisfy the vector wave equation


$$\nabla^2 A(r,t) - \frac{1}{c^2}\frac{\partial^2 A(r,t)}{\partial t^2} = 0. \tag{A.28}$$
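The step described as straightforward can be spelled out as follows. This is a sketch that uses only the definitions of the vector potential given above, Eq. (A.26), and the standard vector identity $\nabla \times (\nabla \times A) = \nabla(\nabla \cdot A) - \nabla^2 A$:

```latex
\begin{align*}
\nabla \times H &= \epsilon_0 \frac{\partial E}{\partial t}
  && \text{Eq. (A.26)} \\
\frac{1}{\mu_0}\,\nabla \times (\nabla \times A) &= -\epsilon_0 \frac{\partial^2 A}{\partial t^2}
  && \text{substituting } E = -\partial A/\partial t,\; H = (\nabla \times A)/\mu_0 \\
\nabla(\nabla \cdot A) - \nabla^2 A &= -\mu_0 \epsilon_0 \frac{\partial^2 A}{\partial t^2}
  && \text{vector identity} \\
\nabla^2 A - \frac{1}{c^2}\frac{\partial^2 A}{\partial t^2} &= 0
  && \text{Coulomb gauge } \nabla \cdot A = 0,\; \mu_0 \epsilon_0 = c^{-2}
\end{align*}
```

The remaining Maxwell equations are satisfied automatically: (A.24) holds because the curl and the time derivative commute, (A.25) follows from the Coulomb gauge, and (A.27) holds because the divergence of a curl vanishes.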


By using the method of separation of variables to solve for the complex vector potential,
we may express $A(r,t) = q_{l,\sigma}(t)\,u_{l,\sigma}(r)$, so that Eq. (A.28) is now expressed as
the decoupled mode equations

$$\nabla^2 u_{l,\sigma}(r) + \frac{\omega_l^2}{c^2}\,u_{l,\sigma}(r) = 0, \text{ and} \tag{A.29}$$

$$\frac{d^2 q_{l,\sigma}(t)}{dt^2} + \omega_l^2\,q_{l,\sigma}(t) = 0, \tag{A.30}$$

5 The development of field quantization in this section has been taken from the lecture notes of
MIT class 6.972, Fall 2002, taught by Prof. Jeffrey H. Shapiro.
where Eq. (A.29) is the vector Helmholtz equation, Eq. (A.30) represents the dynamics
of a simple harmonic oscillator (SHO), and $-\omega_l^2/c^2$ is the separation constant for doing
the separation of variables. The spatial mode index $l = (l_x, l_y, l_z)$ is a triplet of non-negative
integers (not all zero) and $\sigma \in \{0, 1\}$ is a polarization mode index. Upon
solving with the simplest boundary conditions in 3D cartesian coordinates, i.e., the
$V = L \times L \times L$ cubical cavity, we obtain the following solutions,

$$u_{l,\sigma}(r) = \frac{1}{L^{3/2}}\,e^{jk_l\cdot r}\,e_{l,\sigma}, \text{ and} \tag{A.31}$$

$$q_{l,\sigma}(t) = q_{l,\sigma}\,e^{-j\omega_l t}, \text{ for } t \ge 0, \tag{A.32}$$

where $k_l = (2\pi l_x/L,\ 2\pi l_y/L,\ 2\pi l_z/L)$ is the wave vector for the spatial mode $l$, satisfying
$k_l \cdot k_l = (2\pi/L)^2\,l \cdot l = \omega_l^2/c^2$. Let us renormalize the harmonic oscillator temporal
mode function $q_{l,\sigma}(t)$ as follows,


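$$q_{l,\sigma}(t) = \sqrt{\frac{\hbar}{2\omega_l\epsilon_0}}\;a_{l,\sigma}(t) \tag{A.33}$$

$$= \sqrt{\frac{\hbar}{2\omega_l\epsilon_0}}\;a_{l,\sigma}\,e^{-j\omega_l t}, \tag{A.34}$$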

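where $a_{l,\sigma}(t)$ is a dimensionless complex-valued mode function. By taking the appropriate
derivatives of the vector potential, we can compute the complex electric and
magnetic fields:

$$E(r,t) = \sum_{l,\sigma} j\sqrt{\frac{\hbar\omega_l}{2\epsilon_0 L^3}}\;a_{l,\sigma}\,e^{-j(\omega_l t - k_l\cdot r)}\,e_{l,\sigma}, \tag{A.35}$$

$$H(r,t) = \sum_{l,\sigma} j\sqrt{\frac{\hbar c^2}{2\omega_l\mu_0 L^3}}\;a_{l,\sigma}\,e^{-j(\omega_l t - k_l\cdot r)}\,k_l \times e_{l,\sigma}. \tag{A.36}$$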
The stored energy in the EM field in the cavity is given by

$$H = \int_V\left(\frac{1}{2}\epsilon_0\,E\cdot E + \frac{1}{2}\mu_0\,H\cdot H\right)dv, \text{ which simplifies to} \tag{A.37}$$

$$= \sum_{l,\sigma}\hbar\omega_l\left(a^*_{l,\sigma}\,a_{l,\sigma}\right). \tag{A.38}$$

Note that the total energy is time independent as $a^*_{l,\sigma}(t)\,a_{l,\sigma}(t)$ is phase-insensitive.
The radiation field in Eqs. (A.35) and (A.36) is quantized by associating operators
$\hat a_{l,\sigma}(t)$ with the normalized SHO mode functions $a_{l,\sigma}(t)$, whose real and imaginary parts
are the normalized canonical position and momentum operators, i.e.,

$$\hat a_{l,\sigma}(t) = \hat a_{1l,\sigma}(t) + j\,\hat a_{2l,\sigma}(t), \tag{A.39}$$


where the quadrature operators of the same spatial mode must satisfy the canonical
commutation relation $[\hat a_{1l,\sigma}, \hat a_{2l,\sigma}] = j/2$. The field operator and its adjoint
for a pair of spatial modes must thus satisfy the commutation relation

$$\left[\hat a_{l,\sigma}(t),\ \hat a^\dagger_{l',\sigma'}(t)\right] = \delta_{l,l'}\,\delta_{\sigma,\sigma'}. \tag{A.40}$$


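The quantized field operators and the Hamiltonian (the total energy operator) are
thus given by

$$\hat E(r,t) = \sum_{l,\sigma} j\sqrt{\frac{\hbar\omega_l}{2\epsilon_0 L^3}}\left(\hat a_{l,\sigma}\,e^{-j(\omega_l t - k_l\cdot r)} - \hat a^\dagger_{l,\sigma}\,e^{j(\omega_l t - k_l\cdot r)}\right)e_{l,\sigma}, \tag{A.41}$$

$$\hat H(r,t) = \sum_{l,\sigma} j\sqrt{\frac{\hbar c^2}{2\omega_l\mu_0 L^3}}\left(\hat a_{l,\sigma}\,e^{-j(\omega_l t - k_l\cdot r)} - \hat a^\dagger_{l,\sigma}\,e^{j(\omega_l t - k_l\cdot r)}\right)k_l \times e_{l,\sigma}, \tag{A.42}$$

$$\hat H = \sum_{l,\sigma}\frac{\hbar\omega_l}{2}\left(\hat a^\dagger_{l,\sigma}\hat a_{l,\sigma} + \hat a_{l,\sigma}\hat a^\dagger_{l,\sigma}\right) \tag{A.43}$$

$$= \sum_{l,\sigma}\hbar\omega_l\left(\hat a^\dagger_{l,\sigma}\hat a_{l,\sigma} + \frac{1}{2}\right) \tag{A.44}$$

$$= \sum_{l,\sigma}\hbar\omega_l\left(\hat N_{l,\sigma} + \frac{1}{2}\right), \tag{A.45}$$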
where $\hat N_{l,\sigma} = \hat a^\dagger_{l,\sigma}\hat a_{l,\sigma}$ is the photon number operator for the mode indexed by $(l,\sigma)$.
It is evident from Eqs. (A.41) and (A.42) that the electric and magnetic field
operators can be written as the sum of a positive-frequency component and a
Hermitian-conjugate negative-frequency component, i.e.,

$$\hat E(r,t) = \hat E^{(+)}(r,t) + \hat E^{(-)}(r,t), \tag{A.46}$$

$$\hat H(r,t) = \hat H^{(+)}(r,t) + \hat H^{(-)}(r,t), \tag{A.47}$$

where $\hat E^{(-)}(r,t) = \hat E^{(+)\dagger}(r,t)$ and $\hat H^{(-)}(r,t) = \hat H^{(+)\dagger}(r,t)$.

A.3.1 Semiclassical vs. quantum theory of photodetection: coherent states

Let us assume that only one polarization is excited and that the only excited modes are
+z-going plane waves with wave number $\omega_l/c = k_l = 2\pi l/L$, $l \in \{1, 2, \ldots\}$, i.e.,
$l_x = l_y = 0$, $l_z = l$, impinging on an ideal photodetector. Also assume that the only
modes excited lie within a frequency band $\omega_0 \pm \Delta\omega$, with $\Delta\omega \ll \omega_0$. Further assuming
that we only look at the electric field in the time window $t_0 \le t \le t_0 + T$, where
$T = L/c$, and normalizing the field operator to $\sqrt{\text{photons/sec}}$ units by integrating
the field over the photosensitive surface of the photodetector, we have for the
positive-frequency field operator

$$\hat E^{(+)}(t) = \frac{1}{\sqrt{T}}\sum_{l=-\infty}^{\infty}\hat a_l\,e^{-j2\pi lt/T}, \text{ for } t_0 \le t \le t_0 + T, \tag{A.48}$$

where $[\hat a_n, \hat a_m^\dagger] = \delta_{nm}$. Semiclassical theory predicts the photocurrent $i(t)$ to be an
inhomogeneous Poisson impulse train with rate function $q|E(t)|^2$, given that the detector
is illuminated by a deterministic classical field $E(t)$. The noise inherent to this
Poisson process is what defines the shot-noise limit of semiclassical photodetection.
Quantum theory of photodetection, on the other hand, predicts the photocurrent
produced by the ideal photodetector to be a stochastic process whose statistics are
those of the Hermitian photocurrent operator $\hat i(t) = q\,\hat E^{(+)\dagger}(t)\,\hat E^{(+)}(t)$. Just like the
measurement of any other dynamical observable in the framework of quantum mechanics,
the photocurrent statistics are governed by the quantum state of the field.
Non-classical states of the field, such as photon number states, quadrature squeezed
states, etc., do not obey the photocurrent statistics predicted by the semiclassical
theory. We define classical states of the field to be those whose photocurrent measurement
statistics predicted by the quantum theory comply with what is predicted
by the semiclassical theory. Such states are known to be coherent states, and are
eigenstates of the positive-frequency field operator $\hat E^{(+)}(t)$, indexed by the complex amplitude
of the field $E^{(+)}(t)$. The general multi-mode coherent state of the field $\hat E^{(+)}(r,t)$ is
given by

$$|\boldsymbol{\alpha}\rangle = \bigotimes_{l,\sigma}|\alpha_{l,\sigma}\rangle_{l,\sigma} \tag{A.49}$$

$$\equiv |E^{(+)}(r,t)\rangle, \tag{A.50}$$


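where $\hat a_{l,\sigma}|\alpha_{l,\sigma}\rangle_{l,\sigma} = \alpha_{l,\sigma}|\alpha_{l,\sigma}\rangle_{l,\sigma}$ is satisfied for each mode $(l,\sigma)$. It is easily verified
that the multi-mode coherent state is an eigenstate of

$$\hat E^{(+)}(r,t) = \sum_{l,\sigma} j\sqrt{\frac{\hbar\omega_l}{2\epsilon_0 L^3}}\left(\hat a_{l,\sigma}\,e^{-j(\omega_l t - k_l\cdot r)}\right)e_{l,\sigma},$$

i.e.,

$$\hat E^{(+)}(r,t)\,|E^{(+)}(r,t)\rangle = E^{(+)}(r,t)\,|E^{(+)}(r,t)\rangle, \tag{A.51}$$

with eigenfunction $E^{(+)}(r,t) = \sum_{l,\sigma} j\sqrt{\hbar\omega_l/(2\epsilon_0 L^3)}\left(\alpha_{l,\sigma}\,e^{-j(\omega_l t - k_l\cdot r)}\right)e_{l,\sigma}$.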


A.3.2 Photon-number (Fock) states

Photon-number states (or Fock states) are states of the quantized field that have a
fixed number of photons in each mode, i.e., the measurement statistics of an ideal
photodetector on a Fock state are deterministic. A multi-mode Fock state is given by
the tensor product

$$|\mathbf{n}\rangle = \bigotimes_{l,\sigma}|n_{l,\sigma}\rangle_{l,\sigma}, \tag{A.52}$$

in which each single-mode Fock state $|n_{l,\sigma}\rangle_{l,\sigma}$ is the eigenstate of the corresponding
mode’s photon number operator $\hat N_{l,\sigma} = \hat a^\dagger_{l,\sigma}\hat a_{l,\sigma}$, i.e.,

$$\hat N_{l,\sigma}\,|n_{l,\sigma}\rangle_{l,\sigma} = n_{l,\sigma}\,|n_{l,\sigma}\rangle_{l,\sigma}, \tag{A.53}$$

for $n_{l,\sigma} \in \{0, 1, 2, \ldots\}$.

A.3.3 Single-mode states and characteristic functions

In all that follows, we shall drop the mode-index subscripts $(l,\sigma)$ and will refer only to
a single mode of the bosonic field, unless noted otherwise. A single mode, as we have
seen, is characterized by the non-Hermitian operator $\hat a$, whose eigenstates $|\alpha\rangle$, $\alpha \in \mathbb{C}$,
are classical states, i.e., they yield Poisson statistics for an ideal photon-counting
measurement. The photon number operator $\hat N = \hat a^\dagger\hat a$ is a Hermitian operator whose
measurement counts the number of photons in the mode. Its eigenstates $|n\rangle$, $n \in
\{0, 1, \ldots\}$, are called Fock states or photon-number states, and they are non-classical
states. It can be easily verified that the field operator $\hat a$ takes a Fock state $|n\rangle$ to a
Fock state with one fewer photon, $|n-1\rangle$, and the conjugate operator $\hat a^\dagger$
takes a Fock state $|n\rangle$ to another Fock state with one additional photon,
$|n+1\rangle$, i.e.,

$$\hat a\,|n\rangle = \sqrt{n}\,|n-1\rangle \tag{A.54}$$

$$\hat a^\dagger\,|n\rangle = \sqrt{n+1}\,|n+1\rangle. \tag{A.55}$$

Because of the above property, we shall call the operator $\hat a$ the annihilation operator
and $\hat a^\dagger$ the creation operator of the mode. They are sometimes also known as ladder
operators. The Fock states form a complete orthonormal (CON) basis for all states
of a single-mode bosonic field, viz., $\langle m|n\rangle = \delta_{mn}$ and $\hat I = \sum_n |n\rangle\langle n|$, for $\hat I$ the
identity operator. Therefore, coherent states can be expanded in the Fock basis. Not
surprisingly, we obtain

$$|\alpha\rangle = \sum_{n=0}^{\infty}\frac{e^{-|\alpha|^2/2}\,\alpha^n}{\sqrt{n!}}\,|n\rangle, \tag{A.56}$$
confirming the fact that the probability of counting $m$ photons when a single-mode
coherent state is subjected to an ideal photon-counting measurement is given by the Poisson
formula $p(m) = e^{-|\alpha|^2}|\alpha|^{2m}/m!$. The displacement operator is defined as

$$\hat D(\alpha) = \exp(\alpha\hat a^\dagger - \alpha^*\hat a). \tag{A.57}$$


It displaces the vacuum state to a coherent state, $\hat D(\alpha)|0\rangle = |\alpha\rangle$. Coherent states
do not form an orthonormal set, unlike number states. The inner product of two
coherent states is given by

$$\langle\alpha|\beta\rangle = \exp\left[\alpha^*\beta - \frac{1}{2}\left(|\alpha|^2 + |\beta|^2\right)\right], \tag{A.58}$$

and the squared magnitude of the inner product is given by $|\langle\alpha|\beta\rangle|^2 = e^{-|\alpha-\beta|^2}$, so
that $|\alpha\rangle$ and $|\beta\rangle$ are nearly orthogonal when $|\alpha - \beta| \gg 1$. The coherent states form
an overcomplete basis of the single-mode state space, i.e., they resolve the identity
via

$$\int |\alpha\rangle\langle\alpha|\,\frac{d^2\alpha}{\pi} = \sum_{n=0}^{\infty}|n\rangle\langle n|. \tag{A.59}$$
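The coherent-state relations above can be checked numerically in a truncated Fock basis. The sketch below builds the amplitudes of Eq. (A.56) and compares the photon-counting statistics and the inner product with the closed-form expressions; the truncation dimension, the two amplitudes, and the function name `coherent_state` are assumptions made for the example.

```python
import numpy as np
from math import factorial

def coherent_state(alpha, dim=60):
    """Fock-basis amplitudes of |alpha>, Eq. (A.56), truncated to `dim` terms."""
    n = np.arange(dim)
    sqrt_fact = np.sqrt([float(factorial(k)) for k in range(dim)])
    return np.exp(-abs(alpha) ** 2 / 2) * np.power(complex(alpha), n) / sqrt_fact

alpha, beta = 1.0, 1.5 + 0.5j
ket_a, ket_b = coherent_state(alpha), coherent_state(beta)

photon_pmf = np.abs(ket_b) ** 2                         # Poisson with mean |beta|^2
mean_photons = float(np.sum(np.arange(ket_b.size) * photon_pmf))

overlap = np.vdot(ket_a, ket_b)                         # <alpha|beta>; vdot conjugates ket_a
overlap_formula = np.exp(np.conj(alpha) * beta
                         - 0.5 * (abs(alpha) ** 2 + abs(beta) ** 2))   # Eq. (A.58)
```

With 60 Fock terms the truncation error is far below double precision for amplitudes of this size; larger amplitudes would need a larger `dim`.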

71  n= 0 

The thermal state of a mode with annihilation operator $\hat a$ is an isotropic Gaussian
mixture of coherent states, i.e.,

$$\hat\rho_T = \int\frac{e^{-|\alpha|^2/N}}{\pi N}\,|\alpha\rangle\langle\alpha|\,d^2\alpha, \tag{A.60}$$

where $N = \langle\hat N\rangle$ is the average photon number in the state $\hat\rho_T$. The thermal state
can also be equivalently expressed as a statistical mixture of Fock states with a
Bose-Einstein distribution, i.e.,

$$\hat\rho_T = \sum_{n=0}^{\infty}\frac{N^n}{(1+N)^{n+1}}\,|n\rangle\langle n|. \tag{A.61}$$

From Eq. (A.61) we immediately have that the von Neumann entropy of the thermal
state is $S(\hat\rho_T) = g(N) \equiv (1+N)\ln(1+N) - N\ln N$, because the photon-number states
are orthonormal.
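The entropy formula can be confirmed by summing the Shannon entropy of the Bose-Einstein eigenvalues in Eq. (A.61) directly. The sketch below does this in nats for a hypothetical mean photon number $N = 2$; the truncation point and function names are assumptions.

```python
import numpy as np

def bose_einstein_pmf(N, n_max=2000):
    """Photon-number distribution N^n / (1+N)^(n+1) of Eq. (A.61), truncated at n_max."""
    n = np.arange(n_max)
    # Written as a geometric series to avoid overflow in N**n for large n.
    return (N / (1.0 + N)) ** n / (1.0 + N)

def g(N):
    """g(N) = (1+N) ln(1+N) - N ln N, in nats."""
    return (1 + N) * np.log(1 + N) - N * np.log(N)

N = 2.0
p = bose_einstein_pmf(N)
p = p[p > 0]                                    # guard against underflow in the tail
entropy_nats = float(-np.sum(p * np.log(p)))    # Shannon entropy of the eigenvalues
mean_photons = float(np.sum(np.arange(p.size) * p))
```

The sum agrees with $g(2) = 3\ln 3 - 2\ln 2 \approx 1.91$ nats, and the mean of the distribution reproduces $N$.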

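We define three kinds of characteristic functions for a single-mode state $\hat\rho$:

1. Normally ordered: $\chi_N(\zeta) = \mathrm{Tr}\left(\hat\rho\,e^{\zeta\hat a^\dagger}e^{-\zeta^*\hat a}\right) = e^{|\zeta|^2/2}\langle\hat D(\zeta)\rangle$,

2. Anti-normally ordered: $\chi_A(\zeta) = \mathrm{Tr}\left(\hat\rho\,e^{-\zeta^*\hat a}e^{\zeta\hat a^\dagger}\right) = e^{-|\zeta|^2/2}\langle\hat D(\zeta)\rangle$,

3. Wigner: $\chi_W(\zeta) = \mathrm{Tr}\left(\hat\rho\,e^{-\zeta^*\hat a + \zeta\hat a^\dagger}\right) = \langle\hat D(\zeta)\rangle$.

As is evident from the definitions above, if one of the characteristic functions is
known, the others can be computed easily. As examples, the antinormally-ordered
characteristic function for a coherent state $|\alpha\rangle$ is $e^{\zeta\alpha^* - \zeta^*\alpha - |\zeta|^2}$; for the thermal state
with mean photon number $N$ it is $e^{-(1+N)|\zeta|^2}$; and for the vacuum state it is $e^{-|\zeta|^2}$.
The Husimi function $Q_{\hat\rho}(\alpha) = \langle\alpha|\hat\rho|\alpha\rangle/\pi$ is a proper probability distribution over the
complex plane $\alpha \in \mathbb{C}$ and is the 2D Fourier transform of the antinormally ordered
characteristic function $\chi_A(\zeta)$, i.e.,

$$\chi_A(\zeta) = \int Q_{\hat\rho}(\alpha)\,e^{\zeta\alpha^* - \zeta^*\alpha}\,d^2\alpha \tag{A.62}$$

$$Q_{\hat\rho}(\alpha) = \frac{1}{\pi^2}\int\chi_A(\zeta)\,e^{-\zeta\alpha^* + \zeta^*\alpha}\,d^2\zeta. \tag{A.63}$$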

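The state $\hat\rho$ can be retrieved from $\chi_A(\zeta)$ as follows:

$$\hat\rho = \int\chi_A(\zeta)\,e^{-\zeta\hat a^\dagger}e^{\zeta^*\hat a}\,\frac{d^2\zeta}{\pi}. \tag{A.64}$$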

A.3.4 Coherent detection

Besides the photon counting measurement of an optical field that we described above,
the most commonly used optical detection schemes are the coherent-detection techniques,
known as homodyne and heterodyne detection.

1.  Homodyne  detection  —  Homodyne  detection  is  used  to  measure  a  single  quadra¬ 
ture  of  the  field.  The  measurement  corresponds  to  measuring  the  Hermitian 
quadrature  operator  K(ae_J,e).  The  actual  realization  of  a  homodyne  detector 
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N+  -  AG 
2\/^lo 


■  Semiclassical  description:  ae  ~  N{Re(ae~ie),  1/4) 


■  Quantum  description:  a#  « - »  olq  =  Re(ae  lt>) 


Figure  A-l:  Balanced  homodyne  detection.  Homodyne  detection  is  used  to  measure 
one  quadrature  of  the  field.  The  signal  field  a  is  mixed  on  a  50-50  beam  splitter  with 
a  local  oscillator  excited  in  a  strong  coherent  state  with  phase  9,  that  has  the  same 
frequency  as  the  signal.  The  outputs  beams  are  incident  on  a  pair  of  photodiodes 
whose  photocurrent  outputs  are  passed  through  a  differential  amplifier  and  a  matched 
filter  to  produce  the  classical  output  ag.  If  the  input  a  is  in  a  coherent  state  |o),  then 
the  output  of  homodyne  detection  is  predicted  correctly  by  both  the  semiclassical 
and  the  quantum  theories,  i.e. ,  a  Gaussian-distributed  real  number  ag  with  mean 
acosB  and  variance  1/4.  If  the  input  state  is  not  a  classical  (coherent)  state,  then  the 
quantum  theory  must  be  used  to  correctly  account  for  the  statistics  of  the  outcome, 
which  is  given  by  the  measurement  of  the  quadrature  operator  9fJ(ae-J'0). 


is  depicted  in  Fig.  A-l.  If  the  input  a  is  in  a  coherent  state  |o),  then  the  out¬ 
put  of  homodyne  detection  is  a  Gaussian  distributed  real  number  ag  with  mean 
acos9  and  variance  1/4.  If  the  local  oscillator  phase  9  =  0,  homodyne  detection 
measures  hi,  the  real  quadrature  of  the  field.  If  the  detected  state  is  a  Gaussian 
state  (see  next  section),  then  the  outcome  of  homodyne  measurement  is  a  real 
Gaussian  random  variable  with  mean  (hi)  and  variance  (A hf)  =  ((hi  —  (hi))2). 

2.  Heterodyne  detection  —  Heterodyne  detection  is  used  to  measure  both  quadra¬ 
tures  of  the  bosonic  field  simultaneously.  For  a  general  input  state  p,  the  out¬ 
come  of  heterodyne  measurement  (01,0:2)  has  a  probability  distribution  given 
by  the  Husimi  function  of  p  given  by  Qp(a)  =  (o|p|o)/7r.  If  the  input  is  a  co¬ 
herent  state  |o),  then  the  outcome  of  measurement  is  a  pair  of  real  variance-1/2 
Gaussian  random  variables  with  means  (3?(o),  3f(o)). 
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4  W 


■  Quantum  description:  a  < - >  a 


Figure A-2: Balanced heterodyne detection. Heterodyne detection is used to measure
both quadratures of the field simultaneously. The signal field $\hat a$ is mixed on a 50-50
beam splitter with a local oscillator excited in a strong coherent state with phase
$\theta = 0$, whose frequency is offset by an intermediate (radio) frequency, $\omega_{IF}$, from
that of the signal. The output beams are incident on a pair of photodiodes whose
photocurrent outputs are passed through a differential amplifier. The output current
of the differential amplifier is split into two paths and the two are multiplied by a pair
of strong orthogonal intermediate-frequency oscillators followed by detection by a pair
of matched filters, to yield two classical outcomes $\alpha_1$ and $\alpha_2$. If the input is a coherent
state $|\alpha\rangle$, then both semiclassical and quantum theories predict the outputs $(\alpha_1, \alpha_2)$
to be a pair of real variance-1/2 Gaussian random variables with means $(\Re(\alpha), \Im(\alpha))$.
For a general input state $\hat\rho$, the outcome of heterodyne measurement $(\alpha_1, \alpha_2)$ has a
distribution given by the Husimi function of $\hat\rho$, $Q_\rho(\alpha) = \langle\alpha|\hat\rho|\alpha\rangle/\pi$.
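The coherent-state measurement statistics quoted above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function names `sample_homodyne` and `sample_heterodyne` are hypothetical, the homodyne mean is taken as $\Re(\alpha e^{-j\theta})$ (which reduces to $\alpha\cos\theta$ for real $\alpha$), and only the coherent-state case is modeled.

```python
import numpy as np

def sample_homodyne(alpha: complex, theta: float, shots: int, seed: int = 0) -> np.ndarray:
    """Homodyne outcomes for a coherent state |alpha>: real Gaussian, variance 1/4."""
    rng = np.random.default_rng(seed)
    mean = (alpha * np.exp(-1j * theta)).real  # assumed mean Re(alpha e^{-j theta})
    return mean + rng.normal(scale=0.5, size=shots)  # std 1/2 gives variance 1/4

def sample_heterodyne(alpha: complex, shots: int, seed: int = 0) -> np.ndarray:
    """Heterodyne outcomes (alpha_1, alpha_2) for |alpha>: variance 1/2 per quadrature."""
    rng = np.random.default_rng(seed)
    mean = np.array([alpha.real, alpha.imag])
    return mean + rng.normal(scale=np.sqrt(0.5), size=(shots, 2))

if __name__ == "__main__":
    het = sample_heterodyne(1.0 + 2.0j, shots=200_000)
    hom = sample_homodyne(1.0 + 2.0j, theta=0.0, shots=200_000)
    print("heterodyne means:", het.mean(axis=0), "variances:", het.var(axis=0))
    print("homodyne mean:", hom.mean(), "variance:", hom.var())
```

Heterodyne detection pays for measuring both quadratures at once with twice the per-quadrature variance of homodyne detection (1/2 versus 1/4).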
A.3.5 Gaussian states


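For a single-mode state $\hat\rho$, let us define the mean field $\langle\hat a\rangle = \mathrm{Tr}(\hat\rho\hat a)$ and the covariance
matrix,

$$K = \begin{pmatrix} \langle\Delta\hat a\,\Delta\hat a^\dagger\rangle & \langle\Delta\hat a^2\rangle \\ \langle\Delta\hat a^{\dagger 2}\rangle & \langle\Delta\hat a^\dagger\,\Delta\hat a\rangle \end{pmatrix}, \tag{A.65}$$

where $\Delta\hat a = \hat a - \langle\hat a\rangle$. The commutation relation $[\hat a, \hat a^\dagger] = 1$ implies that $\langle\Delta\hat a\,\Delta\hat a^\dagger\rangle =
1 + \langle\Delta\hat a^\dagger\,\Delta\hat a\rangle$. Also, the off-diagonal terms are complex conjugates of each other, i.e.,
$\langle\Delta\hat a^{\dagger 2}\rangle = \langle\Delta\hat a^2\rangle^*$. Thus, writing $N = \langle\Delta\hat a^\dagger\,\Delta\hat a\rangle$ and $P = \langle\Delta\hat a^2\rangle$, the covariance matrix takes the form

$$K = \begin{pmatrix} 1 + N & P \\ P^* & N \end{pmatrix}. \tag{A.66}$$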


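For a zero-mean-field ($\langle\hat a\rangle = 0$) state, $\langle\Delta\hat a^\dagger\,\Delta\hat a\rangle = \langle\hat a^\dagger\hat a\rangle$ is the mean photon number
in the state. Also, for states with $\langle\hat a\rangle = 0$, the correlation matrix

$$\begin{pmatrix} \langle\hat a\hat a^\dagger\rangle & \langle\hat a^2\rangle \\ \langle\hat a^{\dagger 2}\rangle & \langle\hat a^\dagger\hat a\rangle \end{pmatrix} \tag{A.67}$$

is identical to the covariance matrix $K$ defined in Eq. (A.65). The symmetrized
covariance matrix is defined as $K_S = K - Q/2$, where

$$Q = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{A.68}$$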


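The Wigner covariance matrix (or the quadrature covariance matrix) is another equivalent
form of the covariance matrix of $\hat\rho$ and is given by

$$K_Q = \begin{pmatrix} \langle\Delta\hat a_1^2\rangle & \tfrac{1}{2}\langle\Delta\hat a_1\Delta\hat a_2 + \Delta\hat a_2\Delta\hat a_1\rangle \\ \tfrac{1}{2}\langle\Delta\hat a_1\Delta\hat a_2 + \Delta\hat a_2\Delta\hat a_1\rangle & \langle\Delta\hat a_2^2\rangle \end{pmatrix} \equiv \begin{pmatrix} V_1 & V_{12} \\ V_{12} & V_2 \end{pmatrix}, \tag{A.69}$$

where $\hat a = \hat a_1 + j\hat a_2$, $\Delta\hat a_1 = \hat a_1 - \langle\hat a_1\rangle$ and $\Delta\hat a_2 = \hat a_2 - \langle\hat a_2\rangle$. The relationship between
these different forms of the covariance matrix is given by

$$U K_Q U^\dagger = K_S, \tag{A.70}$$

where

$$U = \begin{pmatrix} 1 & j \\ 1 & -j \end{pmatrix} \tag{A.71}$$

satisfies $U^\dagger U = 2I$, so that it is a scaled unitary matrix. The relationships between the
elements of $K_Q$ and $K$ work out to be $N + 1/2 = V_1 + V_2$ and $P = (V_1 - V_2) + 2jV_{12}$.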
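The relation between the quadrature and symmetrized covariance matrices can be checked numerically. The sketch below assumes $U = \begin{pmatrix}1 & j\\ 1 & -j\end{pmatrix}$ and uses a hypothetical helper name, `symmetrized_from_quadrature`; the numerical values of $V_1$, $V_2$, $V_{12}$ are arbitrary illustrative choices.

```python
import numpy as np

# Transformation from quadrature operators (a1, a2) to (a, a^dagger); assumed form.
U = np.array([[1, 1j], [1, -1j]])

def symmetrized_from_quadrature(V1: float, V2: float, V12: float) -> np.ndarray:
    """Return K_S = U K_Q U^dagger for the Wigner covariance matrix K_Q."""
    K_Q = np.array([[V1, V12], [V12, V2]], dtype=complex)
    return U @ K_Q @ U.conj().T

if __name__ == "__main__":
    V1, V2, V12 = 0.9, 0.4, 0.1            # illustrative values only
    K_S = symmetrized_from_quadrature(V1, V2, V12)
    N = K_S[0, 0].real - 0.5               # diagonal of K_S is N + 1/2
    P = K_S[0, 1]                          # off-diagonal of K_S is P
    print("N =", N, " P =", P)
    print("U^dagger U =\n", U.conj().T @ U)  # equals 2I: U is a scaled unitary
```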


One definition of a bosonic Gaussian state is a state $\hat\rho$ whose Wigner characteristic
function $\chi_W(\zeta) = \mathrm{Tr}\left(\hat\rho\, e^{-\zeta^*\hat a + \zeta\hat a^\dagger}\right)$ has a logarithm that is quadratic in $(\zeta, \zeta^*)$. An equivalent definition of
a Gaussian state is a state that is completely described by only the first and second
moments of the field.


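Theorem 1.1. The Wigner characteristic function $\chi_W(\zeta)$ of a single-mode Gaussian
state $\hat\rho$ with complex mean $\langle\hat a\rangle = \alpha$ and covariance matrix (A.66) is given by

$$\chi_W(\zeta) = \exp\left[(\alpha^*\zeta - \alpha\zeta^*) + \Re(P^*\zeta^2) - \left(N + \tfrac{1}{2}\right)|\zeta|^2\right]. \tag{A.72}$$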


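Proof. Expressing the Wigner characteristic function $\chi_W(\zeta) = \mathrm{Tr}\left(\hat\rho\, e^{-\zeta^*\hat a + \zeta\hat a^\dagger}\right)$ in
terms of the real and imaginary parts of $\zeta = \zeta_1 + j\zeta_2$, we have

$$\ln\chi_W(\zeta_1, \zeta_2) = \ln\left[\left\langle\exp\left(-2j\zeta_1\hat a_2 + 2j\zeta_2\hat a_1\right)\right\rangle_{\hat\rho}\right]. \tag{A.73}$$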


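Note that $\chi_W(0,0) = 1$. For a function $f(\zeta_1, \zeta_2)$ such that $f(0,0) = 1$, we have the
following Taylor series expansion for $\ln f(\zeta_1, \zeta_2)$ around $(\zeta_1, \zeta_2) = (0,0)$:

$$\begin{aligned}
\ln f(\zeta_1, \zeta_2) ={}& \zeta_1 f_{\zeta_1}(0,0) + \zeta_2 f_{\zeta_2}(0,0) + \tfrac{1}{2}\Big[\zeta_1^2\big(f_{\zeta_1\zeta_1}(0,0) - f_{\zeta_1}(0,0)^2\big) \\
&+ \zeta_1\zeta_2\big(f_{\zeta_1\zeta_2}(0,0) - 2f_{\zeta_1}(0,0)f_{\zeta_2}(0,0) + f_{\zeta_2\zeta_1}(0,0)\big) \\
&+ \zeta_2^2\big(f_{\zeta_2\zeta_2}(0,0) - f_{\zeta_2}(0,0)^2\big)\Big] + \text{h.o.t.}
\end{aligned} \tag{A.74}$$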

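Let us assign $f(\zeta_1, \zeta_2) = \chi_W(\zeta_1, \zeta_2) = \langle\exp(-2j\zeta_1\hat a_2 + 2j\zeta_2\hat a_1)\rangle$, where the expectation
is taken in the state $\hat\rho$. As $\hat\rho$ is a Gaussian state, the logarithm of the Wigner characteristic function
must be a quadratic in $(\zeta_1, \zeta_2)$ by definition. Hence, the expansion in Eq. (A.74) is
exact without the h.o.t. (higher order terms). The partial derivatives of $f(\zeta_1, \zeta_2)$ are
given by:

$$f_{\zeta_1}(\zeta_1, \zeta_2) = \left\langle -2j\hat a_2\, e^{-2j\zeta_1\hat a_2 + 2j\zeta_2\hat a_1}\right\rangle \tag{A.75}$$

$$f_{\zeta_2}(\zeta_1, \zeta_2) = \left\langle 2j\hat a_1\, e^{-2j\zeta_1\hat a_2 + 2j\zeta_2\hat a_1}\right\rangle \tag{A.76}$$

$$f_{\zeta_1\zeta_1}(\zeta_1, \zeta_2) = \left\langle -4\hat a_2^2\, e^{-2j\zeta_1\hat a_2 + 2j\zeta_2\hat a_1}\right\rangle \tag{A.77}$$

$$f_{\zeta_2\zeta_2}(\zeta_1, \zeta_2) = \left\langle -4\hat a_1^2\, e^{-2j\zeta_1\hat a_2 + 2j\zeta_2\hat a_1}\right\rangle \tag{A.78}$$

$$f_{\zeta_1\zeta_2}(\zeta_1, \zeta_2) = \left\langle (-2j\hat a_2)(2j\hat a_1)\, e^{-2j\zeta_1\hat a_2 + 2j\zeta_2\hat a_1}\right\rangle \tag{A.79}$$

$$f_{\zeta_2\zeta_1}(\zeta_1, \zeta_2) = \left\langle (2j\hat a_1)(-2j\hat a_2)\, e^{-2j\zeta_1\hat a_2 + 2j\zeta_2\hat a_1}\right\rangle \tag{A.80}$$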

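Evaluating each partial derivative at $(0,0)$ and substituting in Eq. (A.74) we get

$$\begin{aligned}
\ln f(\zeta_1, \zeta_2) ={}& -2j\zeta_1\langle\hat a_2\rangle + 2j\zeta_2\langle\hat a_1\rangle + \tfrac{1}{2}\Big[\zeta_1^2\big(-4\langle\hat a_2^2\rangle + 4\langle\hat a_2\rangle^2\big) \\
&+ \zeta_1\zeta_2\big(4\langle\hat a_2\hat a_1\rangle - 8\langle\hat a_2\rangle\langle\hat a_1\rangle + 4\langle\hat a_1\hat a_2\rangle\big) \\
&+ \zeta_2^2\big(-4\langle\hat a_1^2\rangle + 4\langle\hat a_1\rangle^2\big)\Big]
\end{aligned} \tag{A.81}$$

$$\begin{aligned}
={}& (-2j\zeta_1\alpha_2 + 2j\zeta_2\alpha_1) + 2\Big(-\zeta_1^2\langle\Delta\hat a_2^2\rangle - \zeta_2^2\langle\Delta\hat a_1^2\rangle \\
&+ \zeta_1\zeta_2\big(\langle\hat a_2\hat a_1\rangle - 2\langle\hat a_2\rangle\langle\hat a_1\rangle + \langle\hat a_1\hat a_2\rangle\big)\Big),
\end{aligned} \tag{A.82}$$

where we used $(\alpha_1, \alpha_2)$ to denote the real and the imaginary parts of $\alpha$. We can express
$\chi_W(\zeta_1, \zeta_2)$ in terms of the entries of the Wigner covariance matrix $K_Q$, by observing
that $V_{12} = \tfrac{1}{2}\langle\Delta\hat a_1\Delta\hat a_2 + \Delta\hat a_2\Delta\hat a_1\rangle = \big(\langle\hat a_2\hat a_1\rangle - 2\langle\hat a_2\rangle\langle\hat a_1\rangle + \langle\hat a_1\hat a_2\rangle\big)/2$. Therefore,

$$\ln\left[\chi_W(\zeta_1, \zeta_2)\right] = (-2j\zeta_1\alpha_2 + 2j\zeta_2\alpha_1) - 2\left(\zeta_1^2 V_2 + \zeta_2^2 V_1 - 2\zeta_1\zeta_2 V_{12}\right), \tag{A.83}$$

which implies,

$$\chi_W(\zeta_1, \zeta_2) = \exp\left[(-2j\zeta_1\alpha_2 + 2j\zeta_2\alpha_1) - 2\left(\zeta_1^2 V_2 + \zeta_2^2 V_1 - 2\zeta_1\zeta_2 V_{12}\right)\right]. \tag{A.84}$$

Substituting $\zeta_1 = (\zeta + \zeta^*)/2$, $\zeta_2 = (\zeta - \zeta^*)/2j$, $N + 1/2 = V_1 + V_2$ and $P = (V_1 -
V_2) + 2jV_{12}$, we can express $\chi_W(\zeta)$ in terms of entries of the covariance matrix $K$ as
follows,

$$\chi_W(\zeta) = \exp\left[(\alpha^*\zeta - \alpha\zeta^*) + \Re(P^*\zeta^2) - \left(N + \tfrac{1}{2}\right)|\zeta|^2\right]. \tag{A.85}$$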
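The equivalence of the quadrature form (A.84) and the complex form (A.85) can be spot-checked numerically. The following is a minimal sketch; the function names `chi_quadrature` and `chi_complex` and all numerical values are illustrative assumptions, and the check only confirms the algebraic substitution, not the Gaussian-state derivation itself.

```python
import numpy as np

def chi_quadrature(z1, z2, a1, a2, V1, V2, V12):
    """Characteristic function in terms of (zeta_1, zeta_2) and K_Q entries, as in (A.84)."""
    linear = -2j * z1 * a2 + 2j * z2 * a1
    quad = -2.0 * (z1**2 * V2 + z2**2 * V1 - 2.0 * z1 * z2 * V12)
    return np.exp(linear + quad)

def chi_complex(zeta, alpha, N, P):
    """Characteristic function in terms of zeta and (N, P), as in (A.85)."""
    linear = np.conj(alpha) * zeta - alpha * np.conj(zeta)
    quad = (np.conj(P) * zeta**2).real - (N + 0.5) * abs(zeta) ** 2
    return np.exp(linear + quad)

if __name__ == "__main__":
    a1, a2 = 0.3, -0.7                  # real and imaginary parts of alpha
    V1, V2, V12 = 0.9, 0.4, 0.1         # entries of K_Q (illustrative)
    N = V1 + V2 - 0.5                   # N + 1/2 = V1 + V2
    P = (V1 - V2) + 2j * V12            # P = (V1 - V2) + 2j V12
    z1, z2 = 0.25, -0.6
    lhs = chi_quadrature(z1, z2, a1, a2, V1, V2, V12)
    rhs = chi_complex(z1 + 1j * z2, a1 + 1j * a2, N, P)
    print(lhs, rhs, np.isclose(lhs, rhs))
```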

Multi-mode Gaussian states and the symplectic diagonalization6. Let us
introduce vector-valued annihilation operators by stacking the annihilation operators
of $N$ independent modes as follows,

$$\hat{\mathbf a} = [\hat a_1 \ldots \hat a_N]^T \tag{A.86}$$

is an $N \times 1$ column vector of annihilation operators. Similarly, the column vector of
creation operators is denoted

$$\hat{\mathbf a}^\dagger = [\hat a_1^\dagger \ldots \hat a_N^\dagger]^T. \tag{A.87}$$

With no loss of generality let us initially restrict our attention to zero-mean Gaussian
states of $N$ modes, such that the state is completely characterized by the $2N \times 2N$
correlation matrix $R$ of Eq. (A.88),

6 The author thanks his colleague Baris I. Erkmen for this section, which has been partly adapted
from [12].

where $I_N$ is an $N \times N$ identity matrix and $*$ refers to element-wise complex conjugation.


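Theorem 1.2. Let $\hat{\mathbf a} = [\hat a_1 \ldots \hat a_N]^T$ be $N$ modes of a field that are in a zero-mean
Gaussian state with $2N \times 2N$ correlation matrix $R$, as given in (A.88). Then, there
exist $S \in \mathbb{C}^{2N \times 2N}$ and $\Lambda \in \mathbb{C}^{2N \times 2N}$, such that

$$R = S \Lambda S^\dagger, \tag{A.89}$$

where $S^\dagger Q S = S Q S^\dagger = Q$ and $\Lambda = \mathrm{diag}\{\lambda_1 + 1, \ldots, \lambda_N + 1, \lambda_1, \ldots, \lambda_N\}$, with

$$Q = \begin{pmatrix} I_N & 0 \\ 0 & -I_N \end{pmatrix} \tag{A.90}$$

and $\lambda_1, \ldots, \lambda_N \ge 0$.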


Proof. We use Williamson's symplectic decomposition theorem on the symmetrized
(real-valued) correlation matrix for the quadratures, $\hat{\mathbf a}_1 = [\hat{\mathbf a} + \hat{\mathbf a}^\dagger]/2$ and $\hat{\mathbf a}_2 =
[\hat{\mathbf a} - \hat{\mathbf a}^\dagger]/2i$, of the annihilation operators [83]. Then the expressions in the theorem
are obtained by transforming this quadrature correlation matrix decomposition into
the annihilation operator correlation matrix via the transformation

$$U = \begin{pmatrix} I_N & iI_N \\ I_N & -iI_N \end{pmatrix}. \tag{A.91}$$

The strength of a symplectic decomposition is the expansion of $\hat{\mathbf a}$ into a new set
of unsqueezed modes with average photon number $\lambda_n$, $n = 1, \ldots, N$ per mode.

Corollary 1.3. Let $\hat{\mathbf a} = [\hat a_1 \ldots \hat a_N]^T$ be in an arbitrary $N$-mode Gaussian state
with mean $\langle\hat{\mathbf a}\rangle$ and covariance matrix $R$. Then $\hat{\mathbf a}$ can be obtained via a symplectic
transformation on an $N$-mode field $\hat{\mathbf d}$ that is in a tensor product of $N$ uncorrelated
thermal (Gaussian) states.

Proof. Consider the following linear transformation on $\hat{\mathbf a}$:

$$\begin{bmatrix} \hat{\mathbf d} \\ \hat{\mathbf d}^\dagger \end{bmatrix} = S^{-1} \begin{bmatrix} \hat{\mathbf a} \\ \hat{\mathbf a}^\dagger \end{bmatrix}, \tag{A.92}$$

where $S^{-1} = Q S^\dagger Q$ is the inverse of the symplectic matrix that diagonalizes $R$.
Utilizing the symplectic diagonalization of $R$, we find that

$$R_d = \Lambda. \tag{A.93}$$

Consequently, $\hat d_n$ has average photon number $\langle\hat d_n^\dagger \hat d_n\rangle = \lambda_n$, for $n = 1, \ldots, N$, where
$\lambda_n \ge 0$ are the symplectic eigenvalues of $R$ found in Theorem 1.2. Furthermore, all
modes $\{\hat d_n\}$ are uncorrelated. Therefore, each mode can be represented as an isotropic
mixture of coherent states displaced by the corresponding mean, and the joint state
is the tensor product of $N$ such states.

Corollary 1.4. Let $\hat{\mathbf d} = [\hat d_1 \ldots \hat d_N]^T$ be $N$ modes in an arbitrary state. A symplectic
transformation on the $N$ modes, mapping $\hat{\mathbf d}$ into $\hat{\mathbf a}$, as

$$\begin{bmatrix} \hat{\mathbf a} \\ \hat{\mathbf a}^\dagger \end{bmatrix} = S \begin{bmatrix} \hat{\mathbf d} \\ \hat{\mathbf d}^\dagger \end{bmatrix} \tag{A.94}$$

does not alter the von Neumann entropy of the state; i.e., if $\hat\rho_d$ and $\hat\rho_a$ denote the input
and output density operators respectively, then $S(\hat\rho_d) = S(\hat\rho_a)$.

Proof. The symplectic transformation given in (A.94) is a canonical transformation,
i.e., it preserves the commutation relations. Thus it can be implemented with
a unitary operator $\hat U$, satisfying $\hat U\hat U^\dagger = \hat U^\dagger\hat U = \hat I$ [84], and a unitary transformation
leaves the von Neumann entropy unchanged. The theorem and corollaries
collectively show that an arbitrary $N$-mode Gaussian state can always be linearly
transformed into a tensor product of $N$ thermal states with no change in the entropy
of the joint state.

As a simple example, using the symplectic diagonalization of a single-mode zero-mean
Gaussian state $\hat\rho$ whose covariance matrix is given by Eq. (A.66), a unitary
squeezing transformation exists that transforms $\hat\rho$ to a zero-mean thermal state $\hat\rho_{T,\bar N}$,
i.e., $\hat\rho = \hat U \hat\rho_{T,\bar N} \hat U^\dagger$ where $\hat\rho_{T,\bar N}$ is a zero-mean thermal state with mean photon number
$\bar N = \sqrt{(N + 1/2)^2 - |P|^2} - 1/2$. Thus the von Neumann entropy of a Gaussian state
whose covariance matrix is given by Eq. (A.66) is given by $S(\hat\rho) = g(\bar N)$.
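The single-mode example can be made concrete with a few lines of code. This sketch assumes the standard form $g(x) = (1+x)\ln(1+x) - x\ln x$ (in nats) for the thermal-state entropy function, which is not restated in this section; the function names are hypothetical. It also checks that the thermal photon number agrees with the positive eigenvalue of $Q K_S$, with $K_S = K - Q/2$.

```python
import numpy as np

def g(x: float) -> float:
    """Assumed thermal-state entropy function (nats): (1+x)ln(1+x) - x ln x."""
    x = max(x, 0.0)                      # guard against tiny negative round-off
    if x == 0.0:
        return 0.0
    return (1.0 + x) * np.log(1.0 + x) - x * np.log(x)

def thermal_photon_number(N: float, P: complex) -> float:
    """Mean photon number of the equivalent thermal state for covariance (A.66)."""
    return float(np.sqrt((N + 0.5) ** 2 - abs(P) ** 2) - 0.5)

def symplectic_eigenvalue(N: float, P: complex) -> float:
    """Positive eigenvalue of Q K_S, where K_S = [[N+1/2, P], [P*, N+1/2]]."""
    Q = np.diag([1.0, -1.0])
    K_S = np.array([[N + 0.5, P], [np.conj(P), N + 0.5]], dtype=complex)
    return float(np.max(np.linalg.eigvals(Q @ K_S).real))

if __name__ == "__main__":
    N, P = 1.2, 0.6 + 0.3j               # illustrative values
    n_bar = thermal_photon_number(N, P)
    print("thermal photon number:", n_bar)
    print("symplectic eigenvalue minus 1/2:", symplectic_eigenvalue(N, P) - 0.5)
    print("entropy g(n_bar) in nats:", g(n_bar))
```

For a squeezed vacuum with squeeze parameter $r$ ($N = \sinh^2 r$, $|P| = \sinh r\cosh r$), the formula returns a thermal photon number of zero, so the entropy vanishes, as expected for a pure state.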
Appendix B


Capacity region of a degraded
quantum broadcast channel with
M receivers


In this appendix, we generalize the capacity region of the two-receiver quantum degraded
broadcast channel proved by Yard et al. [52], to an arbitrary number of receivers.
In chapter 3, we postponed the general proof of the capacity region to this
appendix, but we used this result to evaluate the capacity region of the Bosonic broadcast
channel with an arbitrary number of receivers. For the sake of completeness, and
ease of reading, we restate the set-up of the problem and go through the notation
before we do the proof.

B.1 The Channel Model

The $M$-receiver quantum broadcast channel $\mathcal{N}_{A-Y_0\ldots Y_{M-1}}$ is a quantum channel from
a sender Alice ($A$) to $M$ independent receivers $Y_0, \ldots, Y_{M-1}$. The quantum channel
from $A$ to $Y_0$ is obtained by tracing out all the other receivers from the channel
map, i.e., $\mathcal{N}_{A-Y_0} = \mathrm{Tr}_{Y_1,\ldots,Y_{M-1}}\left(\mathcal{N}_{A-Y_0\ldots Y_{M-1}}\right)$, with a similar definition for $\mathcal{N}_{A-Y_k}$ for
$k \in \{1, \ldots, M-1\}$. We say that a broadcast channel $\mathcal{N}_{A-Y_0\ldots Y_{M-1}}$ is degraded if there
exists a series of degrading channels $\mathcal{N}^{\rm deg}_{Y_k-Y_{k+1}}$ from $Y_k$ to $Y_{k+1}$, for $k \in \{0, \ldots, M-2\}$,
satisfying

$$\mathcal{N}_{A-Y_{k+1}} = \mathcal{N}^{\rm deg}_{Y_k-Y_{k+1}} \circ \cdots \circ \mathcal{N}^{\rm deg}_{Y_0-Y_1} \circ \mathcal{N}_{A-Y_0}. \tag{B.1}$$

The $M$-receiver degraded broadcast channel (see Fig. B-1) describes a physical scenario
in which for each successive $n$ uses of the channel $\mathcal{N}_{A-Y_0\ldots Y_{M-1}}$ Alice communicates
a randomly generated classical message $(m_0, \ldots, m_{M-1}) \in (W_0, \ldots, W_{M-1})$ to
the receivers $Y_0, \ldots, Y_{M-1}$, where the message-sets $W_k$ are sets of classical indices of
sizes $2^{nR_k}$, for $k \in \{0, \ldots, M-1\}$. The messages $(m_0, \ldots, m_{M-1})$ are assumed to be
independent and uniformly distributed over $(W_0, \ldots, W_{M-1})$, i.e.,

$$p_{W_0,\ldots,W_{M-1}}(m_0, \ldots, m_{M-1}) = \prod_{k=0}^{M-1} p_{W_k}(m_k) = \prod_{k=0}^{M-1} \frac{1}{2^{nR_k}}. \tag{B.2}$$

Because of the degraded nature of the channel, given that the transmission rates
are within the capacity region and proper encoding and decoding are employed at
the transmitter and at the receivers, $Y_0$ can decode the entire message $M$-tuple
$(m_0, \ldots, m_{M-1})$, $Y_1$ can decode the reduced message $(M-1)$-tuple $(m_1, \ldots, m_{M-1})$,
and so on, until the noisiest receiver $Y_{M-1}$ can only decode the single message-index
$m_{M-1}$. To convey the message-set $m_0^{M-1} \equiv (m_0, \ldots, m_{M-1})$, Alice prepares $n$-channel-use states that,
after transmission through the channel, result in $M$-partite conditional density matrices
$\left\{\hat\rho^{Y_0^n\ldots Y_{M-1}^n}_{m_0^{M-1}}\right\}$, $\forall m_0^{M-1} \in W_0^{M-1}$. The quantum states received by a receiver, say $Y_0$, can
be found by tracing out the other receivers, viz. $\hat\rho^{Y_0^n}_{m_0^{M-1}} = \mathrm{Tr}_{Y_1^n,\ldots,Y_{M-1}^n}\left(\hat\rho^{Y_0^n\ldots Y_{M-1}^n}_{m_0^{M-1}}\right)$,
etc. Fig. B-2 illustrates this decoding process.


B.2 Capacity Region: Theorem

A $(2^{nR_0}, \ldots, 2^{nR_{M-1}}, n, \epsilon)$ code for this channel consists of an encoder

$$x^n : (W_0^{M-1}) \to A^n, \tag{B.3}$$
[Figure B-1 diagram; block labels: auxiliary complex-valued random variables, transmitter, degraded receivers, received state.]

Figure B-1: This figure summarizes the setup of the transmitter and the channel
model for the $M$-receiver quantum degraded broadcast channel. In each successive
$n$ uses of the channel, the transmitter $A$ sends a randomly generated classical message
$(m_0, \ldots, m_{M-1}) \in (W_0, \ldots, W_{M-1})$ to the $M$ receivers $Y_0, \ldots, Y_{M-1}$, where the
message-sets $W_k$ are sets of classical indices of sizes $2^{nR_k}$, for $k \in \{0, \ldots, M-1\}$.
The dashed arrows indicate the direction of degradation, i.e., $Y_0$ is the least noisy
receiver, and $Y_{M-1}$ is the noisiest receiver. In this degraded channel model, the
quantum state received at the receiver $Y_k$, $\hat\rho^{Y_k}$, can always be reconstructed from the
quantum state received at the receiver $Y_{k'}$, $\hat\rho^{Y_{k'}}$, for $k' < k$, by passing $\hat\rho^{Y_{k'}}$ through
a trace-preserving completely positive map (a quantum channel). For sending the
classical message $(m_0, \ldots, m_{M-1}) \triangleq j$, Alice chooses an $n$-use state (codeword) $\hat\rho_j^{A^n}$
using a prior distribution $p_{j|i_1}$, where $i_k$ denotes the complex values taken by an auxiliary
random variable $T_k$. It can be shown that, in order to compute the capacity
region of the quantum degraded broadcast channel, we need to choose $M-1$ complex-valued
auxiliary random variables with a Markov structure as shown above, i.e.,
$T_{M-1} \to T_{M-2} \to \ldots \to T_k \to \ldots \to T_1 \to A^n$ is a Markov chain.
[Figure B-2 diagram; block labels: degraded receivers, quantum measurement (POVM elements), decoded messages (estimates).]

Figure B-2: This figure illustrates the decoding end of the $M$-receiver quantum
degraded broadcast channel. The decoder consists of a set of measurement operators,
described by positive operator-valued measures (POVMs) for each receiver:
$\left\{\Lambda^0_{m_0\ldots m_{M-1}}\right\}$, $\left\{\Lambda^1_{m_1\ldots m_{M-1}}\right\}$, $\ldots$, $\left\{\Lambda^{M-1}_{m_{M-1}}\right\}$ on $Y_0^n, Y_1^n, \ldots, Y_{M-1}^n$ respectively.
Because of the degraded nature of the channel, if the transmission rates are within the
capacity region and proper encoding and decoding are employed at the transmitter
and at the receivers respectively, $Y_0$ can decode the entire message $M$-tuple to obtain
estimates $(\hat m_0^0, \ldots, \hat m_{M-1}^0)$, $Y_1$ can decode the reduced message $(M-1)$-tuple to
obtain its own estimates $(\hat m_1^1, \ldots, \hat m_{M-1}^1)$, and so on, until the noisiest receiver $Y_{M-1}$
can only decode the single message-index $m_{M-1}$ to obtain an estimate $\hat m_{M-1}^{M-1}$. Even
though the less noisy receivers can decode the messages of the noisier receivers, the
message $m_k$ is intended to be sent to receiver $Y_k$, $\forall k$. Hence, when we say that a
broadcast channel is operating at a rate $(R_0, \ldots, R_{M-1})$, we mean that the message
$m_k$ is reliably decoded by receiver $Y_k$ at the rate $R_k$ bits per channel use.
and a set of positive operator-valued measures (POVMs), $\left\{\Lambda^0_{m_0\ldots m_{M-1}}\right\}$, $\left\{\Lambda^1_{m_1\ldots m_{M-1}}\right\}$,
$\ldots$, $\left\{\Lambda^{M-1}_{m_{M-1}}\right\}$ on $Y_0^n, Y_1^n, \ldots, Y_{M-1}^n$ respectively, such that the mean probability
of a collective correct decision satisfies

$$\mathrm{Tr}\left(\hat\rho^{Y_0^n\ldots Y_{M-1}^n}_{m_0^{M-1}} \left(\bigotimes_{k=0}^{M-1} \Lambda^k_{m_k\ldots m_{M-1}}\right)\right) \ge 1 - \epsilon, \tag{B.4}$$

for $\forall m_0^{M-1} \in W_0^{M-1}$. A rate $M$-tuple $(R_0, \ldots, R_{M-1})$ is achievable if there exists a
sequence of $(2^{nR_0}, \ldots, 2^{nR_{M-1}}, n, \epsilon_n)$ codes with $\epsilon_n \to 0$. The classical capacity region
of the broadcast channel is defined as the convex hull of the closure of all achievable
rate $M$-tuples $(R_0, \ldots, R_{M-1})$. The classical capacity region of the two-user degraded
quantum broadcast channel with discrete alphabet was derived by Yard et al. [52],
and we used the infinite-dimensional extension of Yard et al.'s capacity theorem to
prove the capacity region of the Bosonic broadcast channel, subject to the minimum
output entropy conjecture 2. The capacity region of the degraded quantum broadcast
channel can easily be extended to the case of an arbitrary number $M$ of receivers.
For notational similarity to the capacity region of the classical degraded broadcast
channel, we state the capacity theorem first, using the shorthand notation for Holevo
information we introduced in footnote 6 in chapter 3.


Theorem B.1. The capacity region of the $M$-receiver degraded broadcast channel
$\mathcal{N}_{A-Y_0\ldots Y_{M-1}}$ as defined in Eq. (B.1), is given by

$$\begin{aligned}
R_0 &\le \frac{1}{n} I(A^n; Y_0^n | T_1), \\
R_k &\le \frac{1}{n} I(T_k; Y_k^n | T_{k+1}), \quad \forall k \in \{1, \ldots, M-2\}, \\
R_{M-1} &\le \frac{1}{n} I(T_{M-1}; Y_{M-1}^n),
\end{aligned} \tag{B.5}$$

where $T_k$, $k \in \{1, \ldots, M-1\}$ form a set of auxiliary complex-valued random variables
such that $T_{M-1} \to T_{M-2} \to \ldots \to T_1 \to A^n$ forms a Markov chain, i.e.,

$$p_{T_{M-1},\ldots,T_1,A^n}(i_{M-1}, \ldots, i_1, j) = p_{T_{M-1}}(i_{M-1}) \left(\prod_{k=M-1}^{2} p_{T_{k-1}|T_k}(i_{k-1}|i_k)\right) p_{A^n|T_1}(j|i_1), \tag{B.6}$$

where with a slight abuse of notation, we have used the symbols $T_1, \ldots, T_{M-1}$ to
denote complex-valued classical random variables taking values $i_k \in \mathcal{T}_k$ where $\mathcal{T}_k$
denotes a complex alphabet, as well as to denote quantum systems, by associating
a complete orthonormal set of pure quantum states with the complex probability
densities $p_{T_k}(i_k)$ of these auxiliary random variables. With further abuse of notation,
we have used $A^n$ to denote a classical random variable. See footnote 5 in chapter 3.


In order to find the optimum capacity region, the above rate region must be optimized
over the joint distribution $p_{T_{M-1},\ldots,T_1,A^n}(i_{M-1}, \ldots, i_1, j)$. As Holevo information
is not necessarily additive (unlike Shannon mutual information), the rate region must
also be optimized over the codeword block-length $n$. The above Markov chain structure
of the auxiliary random variables $T_k$, $k \in \{1, \ldots, M-1\}$ is shown to be optimal
in the converse proof, which proves the optimality of the above capacity region without
assuming any special structure of the auxiliary random variables. Also, note
the striking similarity of the expressions for the capacity region given above with
the capacity region of the classical $M$-receiver degraded broadcast channel, given in
Eqs. (3.8). Holevo information takes the place of Shannon mutual information in the
quantum case, and because of the superadditivity of Holevo information, an additional
regularization over the number of channel uses $n$ is required.


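The capacity region above can be re-cast in the Holevo-information notation that
we used earlier in this chapter for the two-receiver quantum broadcast channel. For
the channel model of the multiple-user quantum degraded broadcast channel we described
in the section above (pictorially depicted in Fig. B-1), our proposed capacity
region (in Eqs. (B.5)) can alternatively be expressed as

$$\begin{aligned}
R_0 &\le \frac{1}{n} \sum_{i_1} p_{T_1}(i_1)\, \chi\left(p_{A^n|T_1}(j|i_1), \hat\rho_j^{Y_0^n}\right) \\
&= \frac{1}{n} \sum_{i_1} p_{T_1}(i_1) \left[ S\left(\sum_j p_{A^n|T_1}(j|i_1)\, \hat\rho_j^{Y_0^n}\right) - \sum_j p_{A^n|T_1}(j|i_1)\, S\left(\hat\rho_j^{Y_0^n}\right) \right], \\
R_k &\le \frac{1}{n} \sum_{i_{k+1}} p_{T_{k+1}}(i_{k+1})\, \chi\left(p_{T_k|T_{k+1}}(i_k|i_{k+1}), \hat\rho_{i_k}^{Y_k^n}\right), \quad \forall k \in \{1, \ldots, M-2\} \\
&= \frac{1}{n} \sum_{i_{k+1}} p_{T_{k+1}}(i_{k+1}) \left[ S\left(\sum_{i_k} p_{T_k|T_{k+1}}(i_k|i_{k+1})\, \hat\rho_{i_k}^{Y_k^n}\right) - \sum_{i_k} p_{T_k|T_{k+1}}(i_k|i_{k+1})\, S\left(\hat\rho_{i_k}^{Y_k^n}\right) \right], \\
R_{M-1} &\le \frac{1}{n} \chi\left(p_{T_{M-1}}(i_{M-1}), \hat\rho_{i_{M-1}}^{Y_{M-1}^n}\right) \\
&= \frac{1}{n} \left[ S\left(\sum_{i_{M-1}} p_{T_{M-1}}(i_{M-1})\, \hat\rho_{i_{M-1}}^{Y_{M-1}^n}\right) - \sum_{i_{M-1}} p_{T_{M-1}}(i_{M-1})\, S\left(\hat\rho_{i_{M-1}}^{Y_{M-1}^n}\right) \right].
\end{aligned} \tag{B.7}$$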


Even though the capacity-region expressions above have been written for a discrete
alphabet, they can be generalized to a continuous alphabet of quantum states over an
infinite-dimensional Hilbert space, in which case the summations in Eqs. (B.7) are
replaced by integrals (see footnote 17 in Chapter 3).
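For a finite-dimensional, discrete-alphabet ensemble, the Holevo quantity $\chi(p_X(x), \hat\rho_x) = S\left(\sum_x p_X(x)\hat\rho_x\right) - \sum_x p_X(x) S(\hat\rho_x)$ appearing in the rate expressions can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, entropies are computed in bits, and the qubit ensembles are toy examples rather than bosonic channel outputs.

```python
import numpy as np

def von_neumann_entropy(rho: np.ndarray) -> float:
    """S(rho) = -Tr(rho log2 rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # drop zero eigenvalues (0 log 0 = 0)
    return float(-np.sum(evals * np.log2(evals)))

def holevo_information(probs, states) -> float:
    """chi({p_x, rho_x}) = S(sum_x p_x rho_x) - sum_x p_x S(rho_x)."""
    avg = sum(p * rho for p, rho in zip(probs, states))
    return von_neumann_entropy(avg) - sum(
        p * von_neumann_entropy(rho) for p, rho in zip(probs, states)
    )

if __name__ == "__main__":
    ket0 = np.array([1.0, 0.0])
    ket1 = np.array([0.0, 1.0])
    ketp = (ket0 + ket1) / np.sqrt(2.0)
    proj = lambda v: np.outer(v, v.conj())
    print("orthogonal pair:", holevo_information([0.5, 0.5], [proj(ket0), proj(ket1)]))
    print("non-orthogonal pair:", holevo_information([0.5, 0.5], [proj(ket0), proj(ketp)]))
```

Orthogonal pure states with equal priors give one bit, while non-orthogonal states give strictly less.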


B.3 Capacity Region: Proof (Achievability)

Proof [Achievability ($M = 3$, single channel use)]. It is more instructive to do the
"achievability" part of the proof first, for $M = 3$ receivers. The general proof for the
$M$-receiver case is a logical extension of this proof. We need to prove achievability
only for the single-channel-use rate region (i.e., for $n = 1$ in Eqs. (B.5)), because the
same proof can be applied to multiple-use (larger) quantum systems of the transmitter
and the receiver alphabets to obtain the general capacity region. For any $\epsilon, \delta > 0$, we
will show that for rate 3-tuples $(R_0, R_1, R_2)$ satisfying1

$$I(A; Y_0|T_1) - \delta(1 + 2d_0) \le R_0 \le I(A; Y_0|T_1) + 2\delta I_0, \tag{B.8}$$

$$I(T_1; Y_1|T_2) - \delta(1 + d_1) \le R_1 \le I(T_1; Y_1|T_2) + \delta I_1, \text{ and} \tag{B.9}$$

$$0 \le R_2 = I(T_2; Y_2) - \delta, \tag{B.10}$$

for finite positive real numbers $d_0$, $d_1$, $I_0$, $I_1$, there exists a $(2^{nR_0}, 2^{nR_1}, 2^{nR_2}, n, O(\epsilon))$
code for the degraded broadcast channel $\mathcal{N}_{A-Y_0Y_1Y_2}$. Below is a brief heuristic of the
proof, followed by the actual proof.

We will construct the required triply-indexed set of codewords $\left\{\hat\rho^{A^n}_{m_0,m_1,m_2}\right\}$, $m_k \in 2^{nR_k}$,
as follows. First, we will select a rate $R_2$ code for the channel $\mathcal{N}_{T_2-Y_0Y_1Y_2}$ with codewords
selected in an independent, identically distributed (i.i.d.) manner from the
distribution $p_{T_2}(i_2)$, which conveys the message index $m_2 \in 2^{nR_2}$ to all the three receivers
$Y_0$, $Y_1$ and $Y_2$. We call these codewords the "primary cloud-centers"2. Thereafter,
for each $i_2 \in \mathcal{T}_2$, we pick a code of rate $R_{1,i_2}$ and blocklength approximately
$p_{T_2}(i_2)n$ for the conditional channel $\mathcal{N}^{i_2}_{T_1-Y_0Y_1}$ with codewords selected i.i.d. according
to $p_{T_1|T_2}(i_1|i_2)$. These codewords are called the "secondary cloud-centers". If the receiver
$Y_1$ knows $i_2$, it can decode at rates approaching $R_{1,i_2} \approx I(T_1; Y_1|T_2 = i_2)$, such
that the average rate $R_1 \approx \sum_{i_2} p_{T_2}(i_2) R_{1,i_2}$ is close to the desired rate at which $Y_1$ can
decode the message index $m_1$. $Y_0$ can similarly learn $m_1$ reliably at rates approaching
$R_1$. Then finally, for each $i_2 \in \mathcal{T}_2$ and $i_1 \in \mathcal{T}_1$, we pick a random HSW code
of blocklength approximately $n p_{T_2}(i_2) p_{T_1|T_2}(i_1|i_2)$ for the conditional channel from $A$ to $Y_0$,
with codewords selected i.i.d. according to $p_{A|T_1,T_2}(j|i_1, i_2)$. If the receiver $Y_0$ knows
both $i_2$ and $i_1$ (for our case, as $T_2 \to T_1 \to A$ is a Markov chain, $Y_0$ just needs to
know $i_1$), it can decode at rates approaching $R_{0,i_2,i_1} \approx I(A; Y_0|T_1 = i_1)$, such that the
average rate $R_0 \approx \sum_{i_2,i_1} p_{T_2}(i_2) p_{T_1|T_2}(i_1|i_2) R_{0,i_2,i_1}$ is close to the desired rate at which
$Y_0$ can decode the private message index $m_0$.

1 From now on, we will freely use both the $I(X; Y)$ and the more explicit $\chi(p_X(x), \hat\rho_x^Y)$ notations
interchangeably, for Holevo quantities.

2 To read more about layered-encoding techniques for the classical degraded broadcast channel,
using "cloud-centers" and "clouds", see [3].
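B.3.1 Constructing codebooks with the desired rate-bounds


Let us choose arbitrary $\epsilon > 0$. Pick

$$\begin{aligned}
R_2 &= I(T_2; Y_2) - \delta & \text{(B.11)} \\
&= \chi\left(p_{T_2}(i_2), \hat\rho_{i_2}^{Y_2}\right) - \delta & \text{(B.12)} \\
&\le \chi\left(p_{T_2}(i_2), \hat\rho_{i_2}^{Y_0Y_1Y_2}\right) - \delta. & \text{(B.13)}
\end{aligned}$$

Because of the degraded nature of the channel,

$$I(T_2; Y_2) \le I(T_2; Y_1) \le I(T_2; Y_0) \le I(T_2; Y_0, Y_1, Y_2), \tag{B.14}$$

where the last inequality follows from the fact that the point-to-point channel $\mathcal{N}_{T_2-Y_0Y_1Y_2}$
between $T_2$ and the joint receiver $(Y_0, Y_1, Y_2)$ can transmit information reliably at a
rate at least as high as the capacity of the channel $\mathcal{N}_{T_2-Y_k}$ between $T_2$ and one of
the receivers $Y_k$ alone. Hence by using the HSW theorem for the channel $\mathcal{N}_{T_2-Y_0Y_1Y_2}$, we
obtain an $(R_2, n, \epsilon)$ code $\left\{\hat\rho_{m_2}, \Lambda^0_{m_2}, \Lambda^1_{m_2}, \Lambda^2_{m_2}\right\}$ with all codewords chosen i.i.d. from
$p_{T_2}(i_2)$ and of type $P_2(i_2)$, satisfying $|P_2(\cdot) - p_{T_2}(\cdot)|_1 \le \delta$, and for all $m_2 \in W_2$,

$$\mathrm{Tr}\left(\left(\Lambda^0_{m_2} \otimes \Lambda^1_{m_2} \otimes \Lambda^2_{m_2}\right) \hat\rho^{Y_0^nY_1^nY_2^n}_{m_2}\right) \ge 1 - \epsilon, \tag{B.15}$$

where

$$\hat\rho^{Y_0^nY_1^nY_2^n}_{m_2} = \bigotimes_{l=1}^{n} \hat\rho^{Y_{0,l}Y_{1,l}Y_{2,l}}_{m_2}$$

are product-state codewords, with $\hat\rho^{Y_{0,l}Y_{1,l}Y_{2,l}}_{m_2}$, the output of the channel $\mathcal{N}_{T_2-Y_0Y_1Y_2}$ for the $l$th
transmitted codeword symbol, being the $l$th symbol of the received $n$-symbol codeword3,
for $l \in \{1, \ldots, n\}$.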


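3 Note that throughout this discussion, each codeword symbol is transmitted in a single use of
the channel.

Let us define the cardinalities of the alphabets of $T_2$, $T_1$, and the transmitter $A$
as $|\mathcal{T}_2| = d_2$, $|\mathcal{T}_1| = d_1$ and $|\mathcal{A}| = d_0$. For each $i_2$, define

$$\begin{aligned}
R_{1,i_2} &\triangleq I(T_1; Y_1|T_2 = i_2) - \delta_{i_2} \le d_1 & \text{(B.16)} \\
&\le I(T_1; Y_0, Y_1|T_2 = i_2) - \delta_{i_2} \le d_1 & \text{(B.17)} \\
&= \chi\left(p_{T_1|T_2}(i_1|i_2), \hat\rho_{i_1}^{Y_0Y_1}\right) - \delta_{i_2}, & \text{(B.18)}
\end{aligned}$$

where

$$\delta_{i_2} \triangleq \delta P_2(i_2), \tag{B.19}$$

for $i_2 \in \{1, \ldots, d_2\}$. Define

$$\epsilon_{i_2} \triangleq \epsilon P_2(i_2), \text{ and} \tag{B.20}$$

$$n_{i_2} \triangleq n P_2(i_2), \tag{B.21}$$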

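for $i_2 \in \{1, \ldots, d_2\}$. For each $i_2 \in \mathcal{T}_2$, there exists an $(R_{1,i_2}, n_{i_2}, \epsilon_{i_2})$ random HSW
code $\left\{\hat\rho_{m_{1,i_2}}, \Lambda^{0(i_2)}_{m_{1,i_2}}, \Lambda^{1(i_2)}_{m_{1,i_2}}\right\}$, $\forall m_{1,i_2} \in \{1, \ldots, 2^{n_{i_2}R_{1,i_2}}\}$, for the conditional channel
$\mathcal{N}^{i_2}_{T_1-Y_0Y_1}$, which satisfies

$$E\left[2^{-n_{i_2}R_{1,i_2}} \sum_{m_{1,i_2}=1}^{2^{n_{i_2}R_{1,i_2}}} \mathrm{Tr}\left(\left(\Lambda^{0(i_2)}_{m_{1,i_2}} \otimes \Lambda^{1(i_2)}_{m_{1,i_2}}\right) \hat\rho^{Y_0^{n_{i_2}}Y_1^{n_{i_2}}}_{m_{1,i_2}}\right)\right] \ge 1 - \epsilon_{i_2}, \tag{B.22}$$

where each codeword is chosen i.i.d. from $p_{T_1|T_2}(i_1|i_2)$ and each codeword is of
the type $P_{1|2}(i_1|i_2)$, such that $|P_{1|2}(\cdot|i_2) - p_{T_1|T_2}(\cdot|i_2)|_1 \le \delta_{i_2}$, and the expectation is
over the randomness in the HSW codes. Note that owing to the symmetry of the
random code construction, (B.22) may be equivalently expressed, for any fixed message index $m_{1,i_2}$, as

$$E\left[\mathrm{Tr}\left(\left(\Lambda^{0(i_2)}_{m_{1,i_2}} \otimes \Lambda^{1(i_2)}_{m_{1,i_2}}\right) \hat\rho^{Y_0^{n_{i_2}}Y_1^{n_{i_2}}}_{m_{1,i_2}}\right)\right] \ge 1 - \epsilon_{i_2}. \tag{B.23}$$

Also note that the personal rate to $Y_1$ (to decode message $m_1$) is given by

$$R_1 = \sum_{i_2} P_2(i_2) R_{1,i_2}. \tag{B.24}$$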
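We also have,

$$|P_2(\cdot) - p_{T_2}(\cdot)|_1 = \sum_{i_2=1}^{d_2} |P_2(i_2) - p_{T_2}(i_2)| \le \delta. \tag{B.25}$$

Using Eqs. (B.24) and (B.25), we now derive lower and upper bounds for $R_1$ as follows.

$$\begin{aligned}
R_1 &= \sum_{i_2} P_2(i_2) R_{1,i_2} = \sum_{i_2} p_{T_2}(i_2) R_{1,i_2} - \sum_{i_2} \left(p_{T_2}(i_2) - P_2(i_2)\right) R_{1,i_2} \\
&\ge \sum_{i_2} p_{T_2}(i_2) R_{1,i_2} - |P_2(\cdot) - p_{T_2}(\cdot)|_1\, d_1 & \text{(B.26)} \\
&= \sum_{i_2} p_{T_2}(i_2) \left[I(T_1; Y_1|T_2 = i_2) - \delta_{i_2}\right] - |P_2(\cdot) - p_{T_2}(\cdot)|_1\, d_1 \\
&= I(T_1; Y_1|T_2) - \sum_{i_2} p_{T_2}(i_2)\delta_{i_2} - |P_2(\cdot) - p_{T_2}(\cdot)|_1\, d_1 \\
&= I(T_1; Y_1|T_2) - \delta \sum_{i_2} p_{T_2}(i_2) P_2(i_2) - |P_2(\cdot) - p_{T_2}(\cdot)|_1\, d_1 \\
&\ge I(T_1; Y_1|T_2) - \delta(1 + d_1), & \text{(B.27)}
\end{aligned}$$

where inequality (B.26) follows from (B.16). The upper bound is derived as follows:

$$\begin{aligned}
R_1 &= \sum_{i_2} P_2(i_2) R_{1,i_2} \le \sum_{i_2} P_2(i_2) I(T_1; Y_1|T_2 = i_2) \\
&= \sum_{i_2} p_{T_2}(i_2) I(T_1; Y_1|T_2 = i_2) + \sum_{i_2} \left(P_2(i_2) - p_{T_2}(i_2)\right) I(T_1; Y_1|T_2 = i_2) \\
&\le I(T_1; Y_1|T_2) + |P_2(\cdot) - p_{T_2}(\cdot)|_1 \max_{i_2} I(T_1; Y_1|T_2 = i_2) \\
&\le I(T_1; Y_1|T_2) + \delta I_1, & \text{(B.28)}
\end{aligned}$$

where $I_1 = \max_{i_2} I(T_1; Y_1|T_2 = i_2)$ is a finite non-negative real number. Combining
Eqs. (B.27) and (B.28), we have

$$I(T_1; Y_1|T_2) - \delta(1 + d_1) \le R_1 \le I(T_1; Y_1|T_2) + \delta I_1. \tag{B.29}$$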
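The two-sided bound on $R_1$ can be sanity-checked with toy numbers. The sketch below is purely illustrative: the distributions, the conditional information values and the helper name `r1_and_bounds` are made-up assumptions, chosen so that the type $P_2$ is within $\delta$ of $p_{T_2}$ in $L_1$ distance and each conditional information is at most $d_1$.

```python
import numpy as np

def r1_and_bounds(p_true, P_type, cond_info, delta, d1):
    """Return (lower, R1, upper) for R1 = sum_i P(i) * (I_i - delta * P(i))."""
    p_true, P_type, cond_info = map(np.asarray, (p_true, P_type, cond_info))
    assert np.abs(P_type - p_true).sum() <= delta   # type is delta-close to the prior
    rates = cond_info - delta * P_type              # R_{1,i2} with delta_{i2} = delta * P2(i2)
    R1 = float(P_type @ rates)                      # personal rate to Y1
    I_avg = float(p_true @ cond_info)               # I(T1; Y1 | T2)
    lower = I_avg - delta * (1 + d1)                # lower bound of the derivation
    upper = I_avg + delta * float(cond_info.max())  # upper bound with I_1 = max_i I_i
    return lower, R1, upper

if __name__ == "__main__":
    lower, R1, upper = r1_and_bounds(
        p_true=[0.5, 0.3, 0.2], P_type=[0.52, 0.29, 0.19],
        cond_info=[0.9, 0.4, 0.7], delta=0.05, d1=2,
    )
    print(lower, R1, upper)
```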
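Now, given each $i_2 \in \mathcal{T}_2$, we define for each $i_1 \in \mathcal{T}_1$,

$$\begin{aligned}
R_{0,i_2,i_1} &\triangleq I(A; Y_0|T_1 = i_1) - \delta_{i_2,i_1} \le d_0 & \text{(B.30)} \\
&= \chi\left(p_{A|T_1}(j|i_1), \hat\rho_j^{Y_0}\right) - \delta_{i_2,i_1}, & \text{(B.31)}
\end{aligned}$$

where

$$\delta_{i_2,i_1} \triangleq \delta_{i_2} P_{1|2}(i_1|i_2) = \delta P_2(i_2) P_{1|2}(i_1|i_2). \tag{B.32}$$

Let us also define

$$\epsilon_{i_2,i_1} \triangleq \epsilon_{i_2} P_{1|2}(i_1|i_2) = \epsilon P_2(i_2) P_{1|2}(i_1|i_2), \text{ and} \tag{B.33}$$

$$n_{i_2,i_1} \triangleq n_{i_2} P_{1|2}(i_1|i_2) = n P_2(i_2) P_{1|2}(i_1|i_2). \tag{B.34}$$

Given a fixed $i_2$, for each $i_1$, there exists an $(R_{0,i_2,i_1}, n_{i_2,i_1}, \epsilon_{i_2,i_1})$ random HSW code
$\left\{\hat\rho_{m_{0,i_2,i_1}}, \Lambda^{0(i_2,i_1)}_{m_{0,i_2,i_1}}\right\}$, $m_{0,i_2,i_1} \in \{1, \ldots, 2^{n_{i_2,i_1}R_{0,i_2,i_1}}\}$, for the conditional channel from $A$ to $Y_0$,
with each codeword chosen i.i.d. from $p_{A|T_1,T_2}(j|i_1, i_2) = p_{A|T_1}(j|i_1)$, and each codeword
satisfying

$$E\left[2^{-n_{i_2,i_1}R_{0,i_2,i_1}} \sum_{m_{0,i_2,i_1}=1}^{2^{n_{i_2,i_1}R_{0,i_2,i_1}}} \mathrm{Tr}\left(\Lambda^{0(i_2,i_1)}_{m_{0,i_2,i_1}}\, \hat\rho^{Y_0^{n_{i_2,i_1}}}_{m_{0,i_2,i_1}}\right)\right] \ge 1 - \epsilon_{i_2,i_1}. \tag{B.35}$$

Note that owing to the symmetry of random code construction, (B.35) can alternatively
be expressed, for any fixed message index $m_{0,i_2,i_1}$, as

$$E\left[\mathrm{Tr}\left(\Lambda^{0(i_2,i_1)}_{m_{0,i_2,i_1}}\, \hat\rho^{Y_0^{n_{i_2,i_1}}}_{m_{0,i_2,i_1}}\right)\right] \ge 1 - \epsilon_{i_2,i_1}. \tag{B.36}$$

The personal rate to $Y_0$ (to decode its personal message $m_0$) is given by

$$R_0 = \sum_{i_2,i_1} P_2(i_2) P_{1|2}(i_1|i_2) R_{0,i_2,i_1}. \tag{B.37}$$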
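Lemma B.2: Given two probability density functions $p(x)$ and $q(x)$ defined on the same alphabet $\mathcal{X}$ that satisfy

$$\sum_{x \in \mathcal{X}} |p(x) - q(x)| \le \delta, \tag{B.38}$$

and given that the conditional distributions $p(y|x)$ and $q(y|x)$ defined on the alphabets $\mathcal{X}$ and $\mathcal{Y}$ ($x \in \mathcal{X}$, $y \in \mathcal{Y}$) satisfy

$$\sum_{y \in \mathcal{Y}} |p(y|x) - q(y|x)| \le \delta_x, \quad \forall x, \tag{B.39}$$

then the joint distributions $p(y,x) = p(y|x)p(x)$ and $q(y,x) = q(y|x)q(x)$ must satisfy

$$\sum_{(x,y) \in (\mathcal{X},\mathcal{Y})} |p(y,x) - q(y,x)| \le \delta + \sum_{x \in \mathcal{X}} \delta_x q(x). \tag{B.40}$$

Proof:

$$\sum_{(x,y) \in (\mathcal{X},\mathcal{Y})} |p(y,x) - q(y,x)| \tag{B.41}$$

$$= \sum_{(x,y) \in (\mathcal{X},\mathcal{Y})} \left|(p(x) - q(x))\,p(y|x) + (p(y|x) - q(y|x))\,q(x)\right| \tag{B.42}$$

$$\le \sum_{(x,y) \in (\mathcal{X},\mathcal{Y})} |p(x) - q(x)|\,p(y|x) + \sum_{(x,y) \in (\mathcal{X},\mathcal{Y})} |p(y|x) - q(y|x)|\,q(x) \tag{B.43}$$

$$= \sum_{x \in \mathcal{X}} |p(x) - q(x)| + \sum_{x \in \mathcal{X}} \left(\sum_{y \in \mathcal{Y}} |p(y|x) - q(y|x)|\right) q(x) \tag{B.44}$$

$$\le \delta + \sum_{x \in \mathcal{X}} \delta_x q(x). \tag{B.45}$$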
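As a sanity check on Lemma B.2, the following minimal sketch draws random distributions and compares both sides of (B.40), taking $\delta$ and $\delta_x$ to be the exact $L_1$ distances (the tightest admissible values). The helper names (`random_pmf`, `lemma_b2_sides`, `check_lemma_b2`) are illustrative assumptions and do not appear in the thesis.

```python
import random


def random_pmf(size, rng):
    """Return a random probability mass function with `size` outcomes."""
    weights = [rng.random() + 1e-3 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def l1(p, q):
    """L1 distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))


def lemma_b2_sides(p_x, q_x, p_y_given_x, q_y_given_x):
    """Return (lhs, rhs) of (B.40) with delta, delta_x set to exact L1 distances."""
    delta = l1(p_x, q_x)
    delta_x = [l1(p_row, q_row) for p_row, q_row in zip(p_y_given_x, q_y_given_x)]
    lhs = sum(
        abs(p_x[x] * p_y_given_x[x][y] - q_x[x] * q_y_given_x[x][y])
        for x in range(len(p_x))
        for y in range(len(p_y_given_x[x]))
    )
    rhs = delta + sum(d * q for d, q in zip(delta_x, q_x))
    return lhs, rhs


def check_lemma_b2(trials=1000, nx=4, ny=5, seed=0):
    """Return True if (B.40) holds on every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        p_x, q_x = random_pmf(nx, rng), random_pmf(nx, rng)
        p_yx = [random_pmf(ny, rng) for _ in range(nx)]
        q_yx = [random_pmf(ny, rng) for _ in range(nx)]
        lhs, rhs = lemma_b2_sides(p_x, q_x, p_yx, q_yx)
        if lhs > rhs + 1e-12:
            return False
    return True
```

A passing run is only numerical evidence on the sampled instances; the proof above is what establishes the lemma.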


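Now, we use Eq. (B.37) and Lemma B.2 to derive lower and upper bounds on $R_0$. The derivation of the lower bound proceeds as follows.

$$R_0 = \sum_{i_2,i_1} P_2(i_2) P_{1|2}(i_1|i_2) R_{0,i_2,i_1} \tag{B.46}$$

$$= \sum_{i_2,i_1} p_{T_1|T_2}(i_1|i_2)\, p_{T_2}(i_2)\, R_{0,i_2,i_1} - \sum_{i_2,i_1} \left(p_{T_1|T_2}(i_1|i_2)\, p_{T_2}(i_2) - P_2(i_2) P_{1|2}(i_1|i_2)\right) R_{0,i_2,i_1}$$

$$\ge \sum_{i_2,i_1} p_{T_1,T_2}(i_1,i_2)\, R_{0,i_2,i_1} - d_0 \sum_{i_2,i_1} \left|p_{T_1|T_2}(i_1|i_2)\, p_{T_2}(i_2) - P_2(i_2) P_{1|2}(i_1|i_2)\right| \tag{B.47}$$

$$\ge \sum_{i_2,i_1} p_{T_1,T_2}(i_1,i_2) \left(I(A;Y_0|T_1 = i_1) - \delta_{i_2,i_1}\right) - d_0 \left(\delta + \sum_{i_2} P_2(i_2)\, \delta_{i_2}\right) \tag{B.48}$$

$$= I(A;Y_0|T_1) - \delta \sum_{i_2,i_1} p_{T_1,T_2}(i_1,i_2)\, P_2(i_2) P_{1|2}(i_1|i_2) - d_0 \left(\delta + \delta \sum_{i_2} P_2(i_2)^2\right) \tag{B.49}$$

$$\ge I(A;Y_0|T_1) - \delta - d_0(\delta + \delta) \tag{B.50}$$

$$= I(A;Y_0|T_1) - \delta(1 + 2d_0), \tag{B.51}$$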


where (B.47) follows from Eq. (B.30), (B.48) follows from Eq. (B.30) and Lemma B.2, and (B.50) follows from the fact that $\sum_x p_1(x) p_2(x) \le 1$ for two probability distribution functions $p_1(x)$ and $p_2(x)$ defined on a common alphabet. The derivation of the upper bound proceeds as follows.


$$R_0 = \sum_{i_2,i_1} P_2(i_2) P_{1|2}(i_1|i_2) R_{0,i_2,i_1} \tag{B.52}$$

$$\le \sum_{i_2,i_1} P_2(i_2) P_{1|2}(i_1|i_2)\, I(A;Y_0|T_1 = i_1) \tag{B.53}$$

$$= \sum_{i_2,i_1} p_{T_2}(i_2)\, p_{T_1|T_2}(i_1|i_2)\, I(A;Y_0|T_1 = i_1) + \sum_{i_2,i_1} \left(P_2(i_2) P_{1|2}(i_1|i_2) - p_{T_2}(i_2)\, p_{T_1|T_2}(i_1|i_2)\right) I(A;Y_0|T_1 = i_1) \tag{B.54}$$

$$\le I(A;Y_0|T_1) + \max_{i_1} I(A;Y_0|T_1 = i_1) \sum_{i_2,i_1} \left|P_2(i_2) P_{1|2}(i_1|i_2) - p_{T_2}(i_2)\, p_{T_1|T_2}(i_1|i_2)\right| \tag{B.55}$$

$$\le I(A;Y_0|T_1) + 2\delta I_0, \tag{B.56}$$

where $I_0 = \max_{i_1} I(A;Y_0|T_1 = i_1)$ is a finite non-negative real number. Combining Eqs. (B.51) and (B.56), we have

$$I(A;Y_0|T_1) - \delta(1 + 2d_0) \le R_0 \le I(A;Y_0|T_1) + 2\delta I_0. \tag{B.57}$$

Combining inequalities (B.11), (B.29) and (B.57), we have constructed codebooks for the degraded broadcast channel $\mathcal{N}_{A-Y_0Y_1Y_2}$ transmitting the messages $(m_0, m_1, m_2)$ at a rate 3-tuple $(R_0, R_1, R_2)$ that can be brought arbitrarily close to the postulated ultimate capacity region (B.5) with $M = 3$ and $n = 1$, by choosing $\delta$ small enough. What remains to be shown, in order to complete the proof of achievability of the postulated capacity region, is to

(i) instantiate the codewords of the codes we constructed above, and

(ii) construct measurement operators for the receivers to decode the messages, and show that those measurement operators lead to an average overall error probability that goes as $O(\epsilon)$ for sufficiently large blocklength $n$.

The  above  tasks  are  dealt  with  in  the  following  two  sections. 

B.3.2  Instantiating  the  codewords 

Let us denote the quantum states associated with the auxiliary random variables $T_1$ and $T_2$ as follows: $\mathcal{T}_k = \{\sigma_{k,1}, \sigma_{k,2}, \ldots, \sigma_{k,d_k}\}$, for $k \in \{1,2\}$. Recall that all the codewords $\rho^{T_2^n}_{m_2}$ are of the same type $P_2(\cdot)$, for $m_2 \in W_2$. Without loss of generality, let us assume that the primary-cloud-center codewords are⁴

$$\rho^{T_2^n}_{1} = \sigma_{2,1}^{\otimes n_1} \otimes \sigma_{2,2}^{\otimes n_2} \otimes \cdots \otimes \sigma_{2,d_2}^{\otimes n_{d_2}} \tag{B.58}$$

$$= \bigotimes_{i_2=1}^{d_2} \sigma_{2,i_2}^{\otimes n_{i_2}}, \tag{B.59}$$

and $\pi_2(m_2)$ is a collection of permutations on $n$ elements, such that

$$\rho^{T_2^n}_{m_2} = \pi_2(m_2)\left(\rho^{T_2^n}_{1}\right), \quad \forall m_2 \in W_2. \tag{B.60}$$

⁴Note that $\sigma_{2,k}^{\otimes n_k} = \sigma_{2,k} \otimes \cdots \otimes \sigma_{2,k}$ ($n_k$-fold tensor product). Also, recall from Eq. (B.21) that $n = n_1 + n_2 + \cdots + n_{d_2}$.

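For each primary-cloud-center codeword $\rho^{T_2^n}_{m_2}$, $2^{nR_1}$ secondary-cloud-center codewords $\rho^{T_1^n}_{m_1,m_2}$ are chosen for every $m_1 \in W_1$. Each symbol of the secondary-cloud-center codewords $\rho^{T_1^n}_{m_1,m_2}$ is chosen i.i.d. from $\mathcal{T}_1$ according to the distribution $p_{T_1|T_2}(i_1|i_2)$. As $2^{nR_1} = \prod_{i_2} 2^{n_{i_2}R_{1,i_2}}$ (using Eqs. (B.24) and (B.21)), we may uniquely identify each message $m_1 \in W_1$ with a collection of messages $m_{1,i_2}$ for $i_2 \in \{1, \ldots, d_2\}$, and $m_{1,i_2} \in W_{1,i_2} = \{1, \ldots, 2^{n_{i_2}R_{1,i_2}}\}$. Hence, we have

$$\rho^{T_1^n}_{m_1,m_2} = \pi_2(m_2)\left(\rho^{T_1^n}_{m_1,1}\right) \tag{B.61}$$

$$= \pi_2(m_2)\left(\rho^{T_1^{n_1}}_{m_{1,1}} \otimes \rho^{T_1^{n_2}}_{m_{1,2}} \otimes \cdots \otimes \rho^{T_1^{n_{d_2}}}_{m_{1,d_2}}\right) \tag{B.62}$$

$$= \pi_2(m_2)\left(\bigotimes_{i_2=1}^{d_2} \rho^{T_1^{n_{i_2}}}_{m_{1,i_2}}\right). \tag{B.63}$$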

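Now, each one of the codewords $\rho^{T_1^{n_{i_2}}}_{m_{1,i_2}}$ is of the same type $P_{1|2}(i_1|i_2)$. Hence, without loss of generality, we can assume that⁵

$$\rho^{T_1^{n_{i_2}}}_{1} = \sigma_{1,1}^{\otimes n_{i_2,1}} \otimes \sigma_{1,2}^{\otimes n_{i_2,2}} \otimes \cdots \otimes \sigma_{1,d_1}^{\otimes n_{i_2,d_1}} \tag{B.64}$$

$$= \bigotimes_{i_1=1}^{d_1} \sigma_{1,i_1}^{\otimes n_{i_2,i_1}}, \tag{B.65}$$

and $\pi_{1,i_2}(m_{1,i_2})$ is a collection of permutations on $n_{i_2}$ elements, such that for each $i_2 \in \mathcal{T}_2$,

$$\rho^{T_1^{n_{i_2}}}_{m_{1,i_2}} = \pi_{1,i_2}(m_{1,i_2})\left(\rho^{T_1^{n_{i_2}}}_{1}\right), \quad \forall m_{1,i_2} \in W_{1,i_2}. \tag{B.66}$$

Without loss of generality, $m_1 = 1$ can be mapped to $(m_{1,1}, m_{1,2}, \ldots, m_{1,d_2}) = (1, 1, \ldots, 1)$, i.e.,

$$\rho^{T_1^n}_{1,1} = \rho^{T_1^{n_1}}_{1} \otimes \rho^{T_1^{n_2}}_{1} \otimes \cdots \otimes \rho^{T_1^{n_{d_2}}}_{1}. \tag{B.67}$$

⁵Note that $n_{i_2,i_1} = n_{i_2} P_{1|2}(i_1|i_2)$, and thus, $n_{i_2} = n_{i_2,1} + n_{i_2,2} + \cdots + n_{i_2,d_1}$.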
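Now we can define a permutation $\pi_1(m_1)$ by cascading the permutations $\pi_{1,i_2}(m_{1,i_2})$,

$$\pi_1(m_1) = \bigotimes_{i_2=1}^{d_2} \pi_{1,i_2}(m_{1,i_2}), \tag{B.68}$$

such that

$$\rho^{T_1^n}_{m_1,1} = \pi_1(m_1)\left(\rho^{T_1^n}_{1,1}\right).$$

Combining this with Eq. (B.61) we have,

$$\rho^{T_1^n}_{m_1,m_2} = \pi_2(m_2) \circ \pi_1(m_1)\left(\rho^{T_1^n}_{1,1}\right) \tag{B.69}$$

$$= \pi_2(m_2) \circ \pi_1(m_1)\left(\bigotimes_{i_2=1}^{d_2} \bigotimes_{i_1=1}^{d_1} \sigma_{1,i_1}^{\otimes n_{i_2,i_1}}\right). \tag{B.70}$$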

However, neither the primary nor the secondary cloud-center codewords are the actual codewords sent out by the transmitter, as they are after all drawn from hypothetical auxiliary alphabets. With the secondary-cloud-center codeword $\rho^{T_1^n}_{m_1,m_2}$ given, the final transmitted codewords are drawn from Alice's alphabet around each secondary-cloud-center codeword, and are chosen i.i.d. symbol-by-symbol from the conditional distribution $p_{A|T_1,T_2}(j|i_1,i_2) = p_{A|T_1}(j|i_1)$, $\forall i_2$ (because $T_2 \to T_1 \to A$ is a Markov chain). As $2^{nR_0} = \prod_{i_2,i_1} 2^{n_{i_2,i_1}R_{0,i_2,i_1}}$ (using Eqs. (B.37) and (B.34)), we may uniquely identify each message $m_0 \in W_0$ with a collection of messages $m_{0,i_2,i_1}$ for $(i_1, i_2) \in (\mathcal{T}_1, \mathcal{T}_2)$, and $m_{0,i_2,i_1} \in W_{0,i_2,i_1} = \{1, \ldots, 2^{n_{i_2,i_1}R_{0,i_2,i_1}}\}$, $\forall i_1, i_2$. Hence, the transmitted codewords are given by


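$$\rho^{A^n}_{m_0,m_1,m_2} = \pi_2(m_2) \circ \pi_1(m_1)\left(\rho^{A^n}_{m_0,1,1}\right)$$

$$= \pi_2(m_2) \circ \pi_1(m_1)\left(\rho^{A^{n_{1,1}}}_{m_{0,1,1}} \otimes \rho^{A^{n_{1,2}}}_{m_{0,1,2}} \otimes \cdots \otimes \rho^{A^{n_{1,d_1}}}_{m_{0,1,d_1}} \otimes \rho^{A^{n_{2,1}}}_{m_{0,2,1}} \otimes \cdots \otimes \rho^{A^{n_{d_2,d_1}}}_{m_{0,d_2,d_1}}\right) \tag{B.71}$$

$$= \pi_2(m_2) \circ \pi_1(m_1)\left(\bigotimes_{i_2=1}^{d_2} \bigotimes_{i_1=1}^{d_1} \rho^{A^{n_{i_2,i_1}}}_{m_{0,i_2,i_1}}\right). \tag{B.72}$$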


In summary, given a message triplet $(m_0, m_1, m_2)$, Alice first represents the message $m_0$ as a collection of messages from smaller index-sets, $m_{0,i_2,i_1} \in W_{0,i_2,i_1}$, and generates the codeword $\rho^{A^n}_{m_0,1,1}$ for $(m_0, m_1 = 1, m_2 = 1)$ as shown above. Thereafter, she applies the permutations $\pi_1(m_1)$ and $\pi_2(m_2)$ respectively, in that order, to obtain the final codeword $\rho^{A^n}_{m_0,m_1,m_2}$ to be broadcast on the channel⁶.
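The layered permutation structure summarized above can be illustrated with a purely classical analogue, in which each quantum symbol $\sigma_{k,i}$ is replaced by its integer index. This is a minimal sketch under that assumption; the function names and the toy block sizes are hypothetical and are not taken from the thesis. It shows that a block-wise permutation $\pi_1$ followed by a global permutation $\pi_2$ keeps the conditional type of the $T_1$ symbols within each $T_2$ block, and that a receiver who knows both permutations can undo them in reverse order.

```python
import random


def base_codeword(counts):
    """Sorted word with counts[s] copies of symbol s (the message-1 codeword)."""
    return [s for s, c in enumerate(counts) for _ in range(c)]


def apply_permutation(perm, word):
    """Move the symbol at position j to position perm[j]."""
    out = [None] * len(word)
    for j, target in enumerate(perm):
        out[target] = word[j]
    return out


def invert_permutation(perm):
    """Return the inverse permutation."""
    inverse = [0] * len(perm)
    for j, target in enumerate(perm):
        inverse[target] = j
    return inverse


def blockwise_permutation(block_perms):
    """Combine permutations acting on consecutive blocks (analogue of B.68)."""
    full, offset = [], 0
    for perm in block_perms:
        full.extend(offset + t for t in perm)
        offset += len(perm)
    return full


def random_permutation(n, rng):
    perm = list(range(n))
    rng.shuffle(perm)
    return perm


def layered_codeword(block_counts, block_perms, pi2):
    """Apply pi1 (block-wise) and then pi2 (global) to the sorted T1 word."""
    t1_base = [s for counts in block_counts for s in base_codeword(counts)]
    pi1 = blockwise_permutation(block_perms)
    return apply_permutation(pi2, apply_permutation(pi1, t1_base))


# Toy example: d2 = 2 blocks with n_1 = 3, n_2 = 2, and d1 = 2 symbols per block.
rng = random.Random(1)
block_counts = [[2, 1], [1, 1]]  # n_{i2,i1}, chosen arbitrarily for illustration
block_perms = [random_permutation(sum(c), rng) for c in block_counts]
pi2 = random_permutation(5, rng)
t2_word = apply_permutation(pi2, base_codeword([3, 2]))
t1_word = layered_codeword(block_counts, block_perms, pi2)
```

In this toy model the decoder's "inverse-permute, then measure block by block" strategy corresponds to applying `invert_permutation(pi2)` and then the inverse of the block-wise permutation, which returns the sorted base word.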


B.3.3  Receiver  measurement  and  decoding  error  probability 

The  decoding  process  proceeds  in  three  stages  (M  stages  in  general),  which  unravel  the 
information  from  the  layered  cloud-center  and  cloud  encoding  technique  we  employed 
earlier.  We  start  this  section  with  a  brief  description  of  the  decoding  process  and  how 
it  works.  We  then  follow  it  up  with  constructing  the  actual  measurement  operators 
for  the  three  receivers,  and  provide  a  rigorous  error  analysis  in  order  to  bound  the 
overall  average  probability  of  decoding  error. 


Steps  of  the  decoding  process 


The  following  are  the  steps  of  the  decoding  process: 


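1. $Y_0$, $Y_1$, and $Y_2$ measure $\{\Lambda^0_{m_2}\}$, $\{\Lambda^1_{m_2}\}$, and $\{\Lambda^2_{m_2}\}$ respectively on their respective received states $\rho^{Y_k^n}_{m_0,m_1,m_2}$, and they declare their respective results of measurement, $\hat{m}_2^{[k]}$, to be the common message $\hat{W}_2$.

2. $Y_0$ and $Y_1$ permute their respective codewords according to $\pi_2^{-1}(\hat{m}_2^{[k]})$ for $k \in \{0, 1\}$ respectively. If $Y_0$ and $Y_1$ correctly decoded $m_2$ in step 1 above, after applying the permutations, they should jointly see a state that is close to $\rho^{Y_0^n Y_1^n}_{m_0,m_1,1}$. They measure each block of $n_{i_2}$ symbols, $i_2 \in \mathcal{T}_2$, using $\{\Lambda^{0(i_2)}_{m_{1,i_2}}\}$ and $\{\Lambda^{1(i_2)}_{m_{1,i_2}}\}$ respectively, and concatenate their measurement results $\{\hat{m}^{[k]}_{1,1}, \ldots, \hat{m}^{[k]}_{1,d_2}\} = \hat{m}_1^{[k]}$, $k \in \{0, 1\}$, which they declare to be their decoded message $\hat{W}_1$.

3. Finally $Y_0$ applies the permutation $\pi_1^{-1}(\hat{m}_1^{[0]})$ and obtains a state close to $\rho^{Y_0^n}_{m_0,1,1}$. It measures using the measurement operators $\bigotimes_{i_2=1}^{d_2} \bigotimes_{i_1=1}^{d_1} \Lambda^{0(i_2,i_1)}_{m_{0,i_2,i_1}}$ and concatenates its results $\{\hat{m}_{0,i_2,i_1}\}$ to obtain the estimate $\hat{m}_0$, which it declares as its decoded message $\hat{W}_0$.

⁶ (i) The joint received codewords are given by

$$\rho^{Y_0^n Y_1^n Y_2^n}_{m_0,m_1,m_2} = \mathcal{N}^{\otimes n}_{A-Y_0Y_1Y_2}\left(\rho^{A^n}_{m_0,m_1,m_2}\right). \tag{B.73}$$

(ii) On averaging out the received codeword $\rho^{Y_0^n Y_1^n Y_2^n}_{m_0,m_1,m_2}$ over messages $m_0$ and $m_1$, we obtain

$$E_{m_0,m_1}\left[\rho^{Y_0^n Y_1^n Y_2^n}_{m_0,m_1,m_2}\right] = \sum_{(m_0,m_1) \in (W_0,W_1)} p_{W_0,W_1}(m_0,m_1)\, \rho^{Y_0^n Y_1^n Y_2^n}_{m_0,m_1,m_2} = \rho^{Y_0^n Y_1^n Y_2^n}_{m_2} = \mathcal{N}^{\otimes n}_{T_2-Y_0Y_1Y_2}\left(\rho^{T_2^n}_{m_2}\right). \tag{B.74}$$

(iii) To find the state received by $Y_0$, we must trace out the other receivers.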


Construction  of  the  measurement  operators 

The  above  procedure  can  be  summarized  by  the  action  of  the  following  POVM  ele¬ 
ments  (measurement  operators)  for  the  three  receivers,  which  (adhering  to  the  nota¬ 
tion  set  forth  in  the  beginning  of  section  B.2  above)  are  given  by: 


1.  Y2-{ A^}. 

2-  Yt  —  {A }nim2j,  where 


A1  4 

777,17712 


A1  A 


^/Am2Ami  \m2  y//Am2 ,  and 


^2 


-  7T2(m2)  (  (X)A^ 


02  =  1 


(B.76) 

(B.77) 


3.  Y'o  —  {A?„omi,„,},  where 


{A 


0 

777-0777-1 777-2 


} 


A 


0 

mi  |m2 


A 


o 

m0|m1m2 


d2  di 


’riK) 


02  =  1  *1=1 


(B.78) 

(B.79) 


(B.80) 
Error analysis


Our goal is to prove that with the codewords and the measurement operators we have constructed above, the overall average probability of correct decision $P_{m_0m_1m_2} = 1 - O(\epsilon)$, where

$$P_{m_0m_1m_2} = \mathrm{Tr}\left(\left(\Lambda^0_{m_0m_1m_2} \otimes \Lambda^1_{m_1m_2} \otimes \Lambda^2_{m_2}\right) \rho^{Y_0^n Y_1^n Y_2^n}_{m_0,m_1,m_2}\right). \tag{B.81}$$


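We will use the following two lemmas, whose proofs can be found in [52]:

Lemma B.3: If $0 \le \Lambda \le 1$, then

$$\mathrm{Tr}(\Lambda\sigma) \ge \mathrm{Tr}(\Lambda\rho) - \|\rho - \sigma\|_1. \tag{B.82}$$

Lemma B.4: If $0 \le \Lambda \le 1$, and $E[\mathrm{Tr}(\Lambda\rho)] \ge 1 - \epsilon$, then

$$E\left[\left\|\sqrt{\Lambda}\,\rho\,\sqrt{\Lambda} - \rho\right\|_1\right] \le \sqrt{8\epsilon}. \tag{B.83}$$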
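The two lemmas can be spot-checked numerically on random finite-dimensional states. The sketch below is illustrative only: the helper names are assumptions, NumPy is assumed to be available, and Lemma B.4 is tested in its single-state form (if $\mathrm{Tr}(\Lambda\rho) \ge 1 - \epsilon$ for one fixed $\rho$, then the disturbance is at most $\sqrt{8\epsilon}$), from which the averaged statement follows by concavity of the square root.

```python
import numpy as np


def random_density_matrix(dim, rng):
    """Random full-rank density matrix (positive semidefinite, unit trace)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real


def random_effect(dim, rng, mix):
    """Random operator with 0 <= Lambda <= 1; mix = 0 gives the identity."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    h = a @ a.conj().T
    h = h / np.linalg.eigvalsh(h).max()
    return (1.0 - mix) * np.eye(dim) + mix * h


def psd_sqrt(op):
    """Square root of a positive semidefinite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(op)
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.conj().T


def trace_norm(op):
    """Trace norm: the sum of the singular values."""
    return np.linalg.svd(op, compute_uv=False).sum()


def check_lemmas(trials=200, dim=4, seed=0):
    """Return True if (B.82) and the single-state form of (B.83) hold on all trials."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        rho = random_density_matrix(dim, rng)
        sigma = random_density_matrix(dim, rng)
        lam = random_effect(dim, rng, mix=rng.uniform(0.0, 1.0))
        # Lemma B.3: Tr(Lambda sigma) >= Tr(Lambda rho) - ||rho - sigma||_1
        gap_b3 = (np.trace(lam @ sigma).real - np.trace(lam @ rho).real
                  + trace_norm(rho - sigma))
        # Lemma B.4 (single state): disturbance <= sqrt(8 eps)
        eps = max(1.0 - np.trace(lam @ rho).real, 0.0)
        root = psd_sqrt(lam)
        disturbance = trace_norm(root @ rho @ root - rho)
        if gap_b3 < -1e-9 or disturbance > np.sqrt(8 * eps) + 1e-9:
            return False
    return True
```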


Let  us  begin  by  defining  two  intermediate  states  in  the  decoding  process: 


P 


P 


/ynynyn,  a 

OO  X1  X2  = 
mo, mi, m2 

HYfYfY?  A 

7770,7771, m2 


A0 

777 2 


A1 

7772 


a  o  \  ^Ynynyn 

A2  o0  1  2 

7712  I  r  7710,7711,1712 


A0 

7772 


^-m2  ^  V^-m2  /  ? 


A0  , 

»ni|m2 


A1  .  I  nY o  yi 


i  n  0  1  2 

rai|m2  /  r mo, mi, mi 


A°  , 

mi|m2 


A1  I 

mi\m2 


The  average  probability  of  correct  decision  Pmom,m2  can  be  expressed  as 


E[P, 


momiTO2J 


> 


> 


E[Tr(A°  , 

L  \  mo  |  mi  m2  r  mo  ,mi  ,m2  /  J 

E[Tr(A°  ,  5y”yni 

L  V.  mo|mim2^mo,mi,m2/J 


7770, mi, m2 


7770,7771, m2 


^ f/YnYn 

p"  0  1 


7710,7711,7712 


-E  [\p[ 

E  [Tr 
-E  [\p, 

— E  \\&‘o J1  "2  —  o 

Llrmo,mi,m2  r, 


~/YnY7 

p  0  1 


7770  ,7771, m2  1 

n'vn  \rn  \rn 
)X0  X1  X2 
7770, mi, m2 


1 


J 

J. 


(B.84) 

(B.85) 


(B.86) 
where (B.85) and (B.86) follow from Lemma B.3. In order to bound the last term in (B.86), let us consider the following:


A1^)  )o,YonYrY2 

Wll,i2  /  ^7711,7712 


E[Ii(A^®A^)*g££] 

=  ElTr(Mm2)l(g)A« 

V  \*2  =  1  *2=1 

=  E  iTr  I  I  (  7T2  1  (m2)  PmiJm2Y^) 

*  2  =  1  *2  =  1  / 

=  E|Tr|  |(g)AS»>(g)A«N)^J«" 

22  =  1  22  =  1  / 

Tr  |  |  0AJW  ®0AW')  pZT" 

*2  =  1  *2  =  1  / 


>  E 

-E 


Irmi.l  rrai.l  |1 


>  E 


Tr 


d  2  cfe 

I  ^-rni}2  ®  0 

V22  =  l  22  =  1 


P 


Y  *2  Y 

iI0  1 1 

ml,^2 


V.22  =  1 


IIE 


22  =  1 


n,-  riA, 


Tr(A?<«®A 


-  v'Se 


^2 


>  1  -  ^  ei2  -  v7^ 

22  =  1 
d2 

—  1  -  y:  eP2(*2 )  -  v7^ 

22  =  1 

—  1  —  (6  +  V&) 


-  v7^ 


(B.87) 

(B.88) 

(B.89) 


(B.90) 

(B.91) 

(B.92) 

(B.93) 

(B.94) 

(B.95) 


4i 


Cl) 


(B.96) 


where we define $\epsilon_1 = \epsilon + \sqrt{8\epsilon}$. Eq. (B.87) follows from Eqs. (B.77) and (B.79). Also note that we drop the message index $m_0$ in (B.87), because the expectation averages out $m_0$, as the measurement operator $(\Lambda^0_{m_1|m_2} \otimes \Lambda^1_{m_1|m_2})$ has no $m_0$ dependence. Equation (B.88) simply results from the fact that permuting the measurement operators is equivalent to inverse-permuting the codeword instead. Equation (B.89) follows from the definition of the permutation $\pi_2(m_2)$, and (B.90) follows from Lemma B.3. In obtaining the first term in inequality (B.91), we drop the superscript $Y_2^n$ from the received-state density operator (because it doesn't change the value of the expectation, as the measurement $(\Lambda^0_{m_1|m_2} \otimes \Lambda^1_{m_1|m_2})$ acts only on the joint Hilbert space of $Y_0^n$ and $Y_1^n$), and we use Eqs. (B.61) and (B.63) to express $\rho^{Y_0^n Y_1^n}_{m_1,1} = \bigotimes_{i_2=1}^{d_2} \rho^{Y_0^{n_{i_2}} Y_1^{n_{i_2}}}_{m_{1,i_2}}$. To obtain the second term of the inequality (B.91), first note that (B.15) specialized to $m_2 = 1$ implies that $\mathrm{Tr}\left((\Lambda^0_1 \otimes \Lambda^1_1 \otimes \Lambda^2_1)\, \rho^{Y_0^n Y_1^n Y_2^n}_{m_1,1}\right) \ge 1 - \epsilon$, $\forall m_1$. Also note that by definition, $\rho'^{\,Y_0^n Y_1^n Y_2^n}_{m_1,1} = \left(\sqrt{\Lambda^0_1} \otimes \sqrt{\Lambda^1_1} \otimes \sqrt{\Lambda^2_1}\right) \rho^{Y_0^n Y_1^n Y_2^n}_{m_1,1} \left(\sqrt{\Lambda^0_1} \otimes \sqrt{\Lambda^1_1} \otimes \sqrt{\Lambda^2_1}\right)$. Hence, Lemma B.4 implies $E\left[\left\|\rho^{Y_0^n Y_1^n Y_2^n}_{m_1,1} - \rho'^{\,Y_0^n Y_1^n Y_2^n}_{m_1,1}\right\|_1\right] \le \sqrt{8\epsilon}$. Equation (B.92) follows from the symmetry of random code construction that we earlier observed in going from (B.22) to (B.23). Inequality (B.93) follows from (B.23), and Eq. (B.94) follows from the definition (B.20).


Continuing  from  (B.86),  we  have 


E [Pm0mim2.  —  E  [Tr  (A rno\mim2Pm0,m1,m2} . 


=  E 


Tr  A^*A  "j  (n2 (m2)  o  m (mi )  (g)  (g)  p^J^ 

V  *2  =  1  *1=1  /  \  *2  =  1  *1=1 

v^Se  —  V8e[ 

d2  di  \  /  d2  d\ 


=  E 


(B.97) 


(B.98) 


Tr  7r2(m2)  o  7Ti(mi)  ®  ®  ^(rn,)  O  7Ti(mi)  (g)  (g) 


>P. 


=  E 


*2=1  *1  =  1 

— V8e  —  \/8e7 

(d2  d1  \  /  <i2  di 

88AS;:’,  88C7, 

*2  =  1  *1  =  1  /  \*2  =  1  *1  =  1 

cfe  di 


*2  =  1  *1=1 


—  v^e  —  \/8e7 


nnE 


l  <>(«,«)  ^V2'’1 


—  v^e  — 


PIA?— PT 

22  =  1  21  =  1 

cfo  <*1 

—  a/86  — 

*2  =  1  *1  =  1 
d2  d\ 

=  >-EE  e-Pi|2(ij|f2)-P2(i2)  -  \/8e  — 

22  =  1  2i  =  l 


—  1  ^6  +  x/8i  +  ^8(6  +  Vfc) 

=  1-0(6), 


v  *2  ’*1 
AO 

7770,^2  Al 


(B.99) 

(B.100) 

(B.101) 

(B.102) 

(B.103) 

(B.104) 

(B.105) 
where (B.97) follows from (B.86), (B.96), and two applications of Lemma B.4. Equation (B.98) follows from (B.80) and (B.72). Note that dropping the superscripts $Y_1^n$ and $Y_2^n$ on the received joint quantum state in Eq. (B.97) doesn't make a difference, as the measurement operators $\{\Lambda^0_{m_0|m_1m_2}\}$ act only on the Hilbert space of $Y_0^n$. Equation (B.99) follows from the fact that the measurement operators $\{\Lambda^0_{m_0|m_1m_2}\}$ do not depend on $m_2$, and hence can be chosen arbitrarily up to a permutation $\pi_2(m_2)$. Next, we remove the permutations $\pi_2(m_2) \circ \pi_1(m_1)$ from both the parentheses in (B.99), so that the trace remains unchanged in Eq. (B.100). Equation (B.101) follows from the symmetry of the HSW code construction, (B.102) follows from (B.36), (B.103) follows from the definition (B.33), and (B.105) completes the proof.

B.3.4  Proof  of  achievability  with  M  receivers 

The proof of the achievability of the capacity region of the $M$-receiver degraded quantum broadcast channel (B.5) is a straightforward generalization of the $M = 3$ case we proved above. We will not go through every detail of the $M$-receiver achievability proof here; rather, we sketch the proof. As in the $M = 3$ case, we need to prove achievability only for $n = 1$, because the same proof can be applied to $n$-use (larger) quantum systems of the transmitter and the receivers to obtain the general $n > 1$ capacity region (B.5).

For any $\epsilon, \delta > 0$, we aim to show here that for rate $M$-tuples $(R_0, \ldots, R_{M-1})$ satisfying

$$I(A;Y_0|T_1) - \delta(1 + (M-1)d_0) \le R_0 \le I(A;Y_0|T_1) + (M-1)\delta I_0,$$

$$I(T_k;Y_k|T_{k+1}) - \delta(1 + (M-k-1)d_k) \le R_k \le I(T_k;Y_k|T_{k+1}) + (M-k-1)\delta I_k,$$

$$0 \le R_{M-1} = I(T_{M-1};Y_{M-1}) - \delta, \tag{B.106}$$

there exists a $(2^{nR_0}, \ldots, 2^{nR_{M-1}}, n, O(\epsilon))$ code for the degraded broadcast channel $\mathcal{N}_{A-Y_0 \ldots Y_{M-1}}$, where $d_k = |\mathcal{T}_k|$ is the cardinality of the alphabet associated with the auxiliary random variable $T_k$, and the cardinality of the transmitter's alphabet is $|\mathcal{A}| = d_0$. The quantities $I_k = \max_{i_{k+1}} I(T_k;Y_k|T_{k+1} = i_{k+1})$ are finite non-negative real numbers.
Using the HSW theorem [27, 28, 29] for the channel $\mathcal{N}_{T_{M-1}-Y_0 \ldots Y_{M-1}}$, let us obtain an $(R_{M-1}, n, \epsilon)$ code $\{\rho^{T_{M-1}^n}_{m_{M-1}}\}$, $\{\Lambda^0_{m_{M-1}}\}, \ldots, \{\Lambda^{M-1}_{m_{M-1}}\}$ with all codewords chosen i.i.d. from the distribution $p_{T_{M-1}}(i_{M-1})$, of type $P_{M-1}$ satisfying $\|P_{M-1} - p_{T_{M-1}}\|_1 \le \delta$, and for all $m_{M-1} \in W_{M-1}$,

$$\mathrm{Tr}\left(\left(\bigotimes_{k=0}^{M-1} \Lambda^k_{m_{M-1}}\right) \rho^{Y_0^n \ldots Y_{M-1}^n}_{m_{M-1}}\right) \ge 1 - \epsilon, \tag{B.107}$$

where $\rho^{T_{M-1}^n}_{m_{M-1}} = \bigotimes_{l=1}^{n} \rho^{T_{M-1}}_{m_{M-1},l}$ are product-state codewords. Treating these codewords as the primary cloud-centers, for each $i_{M-1} \in \mathcal{T}_{M-1}$, we choose another layer of codewords for the conditional channel $\mathcal{N}^{i_{M-1}}_{T_{M-2}-Y_0 \ldots Y_{M-2}}$, picked i.i.d. from the distribution $p_{T_{M-2}|T_{M-1}}(i_{M-2}|i_{M-1})$, which form a random HSW code of rate $R_{M-2,i_{M-1}}$. Taking the average of these rates over the entire codebook, the desired rate bound $I(T_{M-2};Y_{M-2}|T_{M-1}) - \delta(1 + d_{M-2}) \le R_{M-2} \le I(T_{M-2};Y_{M-2}|T_{M-1}) + \delta I_{M-2}$ can be established for $R_{M-2}$. Continuing in this manner, we keep selecting HSW codewords from the alphabets of the auxiliary random variables with the appropriate conditional distributions, viz. by applying the HSW theorem to the channel $\mathcal{N}^{i_{k+1}}_{T_k-Y_0 \ldots Y_k}$, to select a code of overall rate $R_k$ close to the desired bound (B.106). Proving the rate bounds involves applications of Lemma B.2 and simple manipulations similar to those leading to the rate bounds for $R_1$ and $R_0$ in the $M = 3$ proof we did earlier.


Codewords and measurement operators are selected in a layered way, exactly as we did earlier for the $M = 3$ case. For the chosen measurement and codewords, the bound for the average probability of correct decision works out to be

$$E\left[P_{m_0 \ldots m_{M-1}}\right] \ge 1 - \left(\epsilon + \sqrt{8\epsilon_0} + \ldots + \sqrt{8\epsilon_{M-2}}\right), \tag{B.108}$$

where $\epsilon_{i+1} = \epsilon_i + \sqrt{8\epsilon_i}$, for $i \in \{0, \ldots, M-3\}$, and $\epsilon_0 = \epsilon$. Hence, $E[P_{m_0 \ldots m_{M-1}}] \ge 1 - O(\epsilon)$, as desired. The proof parallels the layered codebook construction technique used for classical degraded broadcast channels, and works out in much the same manner as the $M = 3$ proof.
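To make the recursion behind (B.108) concrete, the following sketch evaluates the lower bound on the average probability of correct decision for a given per-layer error $\epsilon$ and number of receivers $M$. It assumes the bound has the form $1 - (\epsilon + \sqrt{8\epsilon_0} + \ldots + \sqrt{8\epsilon_{M-2}})$, which is what the $M = 3$ chain ending in (B.105) gives; the function name `correct_decision_bound` is hypothetical.

```python
import math


def correct_decision_bound(eps, num_receivers):
    """Evaluate 1 - (eps + sum_{i=0}^{M-2} sqrt(8 eps_i)), eps_{i+1} = eps_i + sqrt(8 eps_i)."""
    eps_i = eps          # eps_0 = eps
    penalty = eps
    for _ in range(num_receivers - 1):
        penalty += math.sqrt(8 * eps_i)
        eps_i = eps_i + math.sqrt(8 * eps_i)
    return 1 - penalty


if __name__ == "__main__":
    # The bound approaches 1 as eps shrinks, for any fixed number of receivers.
    for eps in (1e-4, 1e-8, 1e-12):
        print(eps, correct_decision_bound(eps, 3))
```

Because each layer feeds a square root of the previous layer's error into the next, the bound degrades with $M$ for fixed $\epsilon$, but it still tends to 1 as $\epsilon \to 0$.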
B.4 Capacity Region: Proof (Converse)


Our goal in proving the converse to the capacity-region proof is to show that any achievable rate $M$-tuple $(R_0, \ldots, R_{M-1})$ must be inside the ultimate rate-region proposed by Eqs. (B.5). Let us assume that $(R_0, \ldots, R_{M-1})$ is achievable. Let $\{x^n(m_0, \ldots, m_{M-1})\}$ and POVMs $\{\Lambda^0_{m_0 \ldots m_{M-1}}\}$, $\{\Lambda^1_{m_1 \ldots m_{M-1}}\}$, $\ldots$, $\{\Lambda^{M-1}_{m_{M-1}}\}$ comprise a $(2^{nR_0}, \ldots, 2^{nR_{M-1}}, n, \epsilon)$ code in the achieving sequence. Let us suppose that the receivers $Y_0, \ldots, Y_{M-1}$ store their respective decoded messages in registers $\hat{W}_0, \ldots, \hat{W}_{M-1}$. Then, for real numbers $\epsilon_{n,k} \to 0$, we have for $k \in \{0, 1, \ldots, M-2\}$

$$nR_k = H(W_k) \tag{B.109}$$

$$\le I(W_k; \hat{W}_k) + n\epsilon_{n,k} \tag{B.110}$$

$$\le \chi\left(p_{W_k}(m_k),\, \rho^{Y_k^n}_{m_k}\right) + n\epsilon_{n,k} \tag{B.111}$$

$$\le \sum_{m_{k+1}} p_{W_{k+1}}(m_{k+1})\, \chi\left(p_{W_k}(m_k),\, \rho^{Y_k^n}_{m_k,m_{k+1}}\right) + n\epsilon_{n,k} \tag{B.112}$$

$$= I(W_k; Y_k^n | W_{k+1}) + n\epsilon_{n,k}, \tag{B.113}$$

where (B.110) and (B.111) follow from Fano's inequality and the Holevo bound respectively. Equation (B.112) follows from concavity of Holevo information (as $\rho^{Y_k^n}_{m_k} = \sum_{m_{k+1}} p_{W_{k+1}}(m_{k+1})\, \rho^{Y_k^n}_{m_k,m_{k+1}}$). For $k = 0$, we further have

$$nR_0 \le I(W_0; Y_0^n | W_1) + n\epsilon_{n,0} \tag{B.114}$$

$$\le I(A^n; Y_0^n | W_1) + n\epsilon_{n,0}, \tag{B.115}$$

where (B.115) follows from the Markov nature of $(W_0, \ldots, W_{M-1}) \to A^n \to Y_0^n \to \cdots \to Y_{M-1}^n$. We also have similarly, for $\epsilon_{n,M-1} \to 0$,

$$nR_{M-1} = H(W_{M-1}) \tag{B.116}$$

$$\le I(W_{M-1}; \hat{W}_{M-1}) + n\epsilon_{n,M-1} \tag{B.117}$$

$$\le \chi\left(p_{W_{M-1}}(m_{M-1}),\, \rho^{Y_{M-1}^n}_{m_{M-1}}\right) + n\epsilon_{n,M-1} \tag{B.118}$$

$$= I(W_{M-1}; Y_{M-1}^n) + n\epsilon_{n,M-1}. \tag{B.119}$$

Choosing $T_k = W_k$ for $k \in \{1, 2, \ldots, M-1\}$ completes the proof.
Appendix C


Theorem  on  property  of  g(x) 


The  converse  proofs  of  the  capacity  region  for  the  Bosonic  broadcast  channel  with  and 
without  thermal  noise,  in  chapter  3,  use  a  theorem  on  a  property  of  the  Bose-Einstein 
entropy  function,  g(x)  =  (1  +  x)  ln(l+x)  —  xlnx,  in  order  to  conclude  Eqs.  (3.59)  and 
(3.90).  In  this  appendix,  we  prove  two  lemmas  which  lead  to  the  proof  of  a  theorem. 
After  that,  we  show  how  the  theorem  implies  Eqs.  (3.59)  and  (3.90),  as  two  simple 
special  cases. 

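Lemma A.1: For all real numbers $x \ge 0$, $C \ge 0$, and $0 \le \kappa \le 1$, the following inequality holds:

$$\frac{\ln\left(1 + \frac{1}{\kappa x + C}\right)}{\ln\left(1 + \frac{1}{x}\right)} \ge \frac{\kappa x(1 + x)}{(\kappa x + C)(1 + \kappa x + C)}. \tag{C.1}$$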

Proof: Define a function $f(x) = x(1 + x)\ln(1 + 1/x)$. We claim that $f(x)$ has the following properties¹:

1. $\lim_{x \to 0} f(x) = 0$.

2. $f(x)$ is a concave function, i.e., the second derivative $f''(x) \le 0$, for $x \ge 0$.

3. $f(x)$ is monotonically increasing for $x \ge 0$.

Given properties 1 and 2 above, we have $f(\kappa x) \ge \kappa f(x)$, for $x \ge 0$ and $0 \le \kappa \le 1$. We further have from property 3 above that for any non-negative real number $C \ge 0$, $f(\kappa x + C) \ge f(\kappa x)$, for $x \ge 0$ and $0 \le \kappa \le 1$. Combining the two above, we obtain $f(\kappa x + C) \ge \kappa f(x)$. Substituting the explicit form of $f(x)$, we have Eq. (C.1), which we set out to prove.

¹ Proofs:

1. We can express $f(x)$ as $f(x) = x(g(x) - \ln x)$. Therefore, $\lim_{x \to 0} f(x) = \lim_{x \to 0}(x g(x)) - \lim_{x \to 0}(x \ln x)$. It is readily verified by applying L'Hôpital's rule that $\lim_{x \to 0}(x g(x)) = \lim_{x \to 0}(x \ln x) = 0$.

2. By straightforward differentiation, $f''(x) = 2\ln(1 + 1/x) - (2x + 1)/(x(1 + x))$. Claim: $\ln(1 + y) \le y(y + 2)/2(y + 1)$, $\forall y \ge 0$. Proof: It is easy to see the following:

• Both the left and right hand sides of the proposed inequality go to zero at $y = 0$.

• Both $\ln(1 + y)$ and $y(y + 2)/2(y + 1)$ are positive for $y > 0$.

• $\frac{d}{dy}\ln(1 + y) \le \frac{d}{dy}\left[\frac{y(y + 2)}{2(y + 1)}\right]$, for $y \ge 0$.

Hence, $\ln(1 + y) \le y(y + 2)/2(y + 1)$, $\forall y \ge 0$. Substituting $y = 1/x$, we get $f''(x) \le 0$, for $x \ge 0$.

3. By straightforward differentiation, $f'(x) = (2x + 1)\ln(1 + 1/x) - 1$. Claim: $\ln(1 + y) \ge y/(y + 2)$, $\forall y \ge 0$. Proof: It is easy to see the following:

• Both the left and right hand sides of the proposed inequality go to zero at $y = 0$.

• Both $\ln(1 + y)$ and $y/(y + 2)$ are positive for $y > 0$.

• $\frac{d}{dy}\ln(1 + y) \ge \frac{d}{dy}\left[\frac{y}{y + 2}\right]$, for $y \ge 0$.

Hence, $\ln(1 + y) \ge y/(y + 2)$, $\forall y \ge 0$. Substituting $y = 1/x$, we get $f'(x) \ge 0$, for $x \ge 0$. Since $\lim_{x \to 0} f(x) = 0$, $f(x)$ must be monotonically increasing for $x \ge 0$.

Lemma A.2: The following holds:

$$\frac{d^2}{dy^2}\, g\left(\kappa g^{-1}(y) + C\right) \ge 0, \tag{C.2}$$

for $y \ge 0$, where $C$ is a non-negative real number.

Proof: Let us define $p(y) = g(\kappa g^{-1}(y) + C)$. Differentiating twice with respect to $y$, we get

$$\frac{d^2 p(y)}{dy^2} = \kappa \ln\left(1 + \frac{1}{\kappa g^{-1}(y) + C}\right)\left(\frac{d^2}{dy^2} g^{-1}(y)\right) - \frac{\kappa^2}{(\kappa g^{-1}(y) + C)(1 + \kappa g^{-1}(y) + C)}\left(\frac{d}{dy} g^{-1}(y)\right)^2. \tag{C.3}$$

Now consider the identity $g(g^{-1}(y)) = y$, and substitute $g^{-1}(y) = x$. Differentiating both sides of the identity with respect to $y$, we get $(dg(x)/dx)(dx/dy) = 1$, which implies $dx/dy = 1/(dg(x)/dx)$. Therefore, we get

$$\frac{d}{dy} g^{-1}(y) = \frac{1}{\ln\left(1 + \frac{1}{g^{-1}(y)}\right)}, \tag{C.4}$$

and thus,

$$\frac{d^2}{dy^2} g^{-1}(y) = \frac{1}{g^{-1}(y)(1 + g^{-1}(y))} \cdot \frac{1}{\left[\ln\left(1 + \frac{1}{g^{-1}(y)}\right)\right]^3}. \tag{C.5}$$

Substituting Eqs. (C.4) and (C.5) into Eq. (C.3) we finally obtain

$$\frac{d^2 p(y)}{dy^2} = \frac{\kappa}{g^{-1}(y)(1 + g^{-1}(y))} \cdot \frac{1}{\left[\ln\left(1 + \frac{1}{g^{-1}(y)}\right)\right]^2} \left[\frac{\ln\left(1 + \frac{1}{\kappa g^{-1}(y) + C}\right)}{\ln\left(1 + \frac{1}{g^{-1}(y)}\right)} - \frac{\kappa g^{-1}(y)(1 + g^{-1}(y))}{(\kappa g^{-1}(y) + C)(1 + \kappa g^{-1}(y) + C)}\right] \tag{C.6}$$

$$\ge 0, \tag{C.7}$$

where the last inequality follows from using Lemma A.1, along with the fact that $g^{-1}(y) \ge 0$, $\forall y \ge 0$.
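The claims about $f(x) = x(1 + x)\ln(1 + 1/x)$ and inequality (C.1) can be spot-checked on a grid. This is a minimal sketch, not part of the thesis; the function names and grid values are assumptions, and the grid is restricted to $x > 0$ and $\kappa > 0$ so that every logarithm is finite.

```python
import math


def f(x):
    """f(x) = x (1 + x) ln(1 + 1/x), for x > 0."""
    return x * (1 + x) * math.log1p(1 / x)


def c1_sides(x, kappa, c):
    """Return (lhs, rhs) of inequality (C.1)."""
    u = kappa * x + c
    lhs = math.log1p(1 / u) / math.log1p(1 / x)
    rhs = kappa * x * (1 + x) / (u * (1 + u))
    return lhs, rhs


def check_lemma_a1(xs, kappas, cs):
    """Return True if (C.1) holds (up to rounding) at every grid point."""
    for x in xs:
        for kappa in kappas:
            for c in cs:
                lhs, rhs = c1_sides(x, kappa, c)
                if lhs < rhs * (1 - 1e-12):
                    return False
    return True


def check_f_properties(xs):
    """Check monotonicity and midpoint concavity of f on consecutive grid points."""
    for a, b in zip(xs, xs[1:]):
        if f(b) < f(a):
            return False
        if f((a + b) / 2) < (f(a) + f(b)) / 2 - 1e-12:
            return False
    return True


GRID_X = [10 ** (e / 4) for e in range(-12, 13)]   # 1e-3 ... 1e3
GRID_KAPPA = [0.05 * i for i in range(1, 21)]      # 0.05 ... 1.0
GRID_C = [0.0, 0.1, 1.0, 10.0]
```

Equality in (C.1) occurs at $\kappa = 1$, $C = 0$, which is why the comparison allows a small relative tolerance.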

Theorem A.3: Given non-negative real numbers $x_k \in \mathbb{R}^+$, for $k \in \{1, \ldots, n\}$, and $0 \le \kappa \le 1$, if $x_0$ is defined by

$$\sum_{k=1}^{n} \frac{1}{n}\, g(x_k) = g(x_0), \tag{C.8}$$

then the following inequality holds:

$$\sum_{k=1}^{n} \frac{1}{n}\, g(\kappa x_k + C) \ge g(\kappa x_0 + C), \tag{C.9}$$

where $g(x) = (1 + x)\log(1 + x) - x\log(x)$, and $C \ge 0$.

Proof: Because $g(x)$ is a one-to-one function, we can define unambiguously the inverse function $h(y) = g^{-1}(y)$, such that $y = g(x) \Leftrightarrow x = h(y)$ for $x, y \ge 0$. Define $y_k = g(x_k)$, $y'_k = g(\kappa g^{-1}(y_k) + C)$ and $l(y_k) = y_k - y'_k$, for $k \in \{0, 1, \ldots, n\}$. Rephrasing the theorem in terms of $h(y)$, we have the following theorem. Given

$$y_0 = \frac{1}{n}\sum_{k=1}^{n} y_k, \quad y_k \ge 0,\ \forall k, \tag{C.10}$$

the following is true:

$$\frac{1}{n}\sum_{k=1}^{n} y'_k \ge y'_0. \tag{C.11}$$

Using Lemma A.2, it follows that $l(y) = y - y' = y - g(\kappa g^{-1}(y) + C)$ is a concave function in $y$, i.e., $l''(y) \le 0$. Thus, Eqn. (C.10) implies

$$l(y_0) \ge \frac{1}{n}\sum_{k=1}^{n} l(y_k), \tag{C.12}$$

which implies

$$y_0 - y'_0 \ge \frac{1}{n}\sum_{k=1}^{n} (y_k - y'_k) \tag{C.13}$$

$$= \frac{1}{n}\sum_{k=1}^{n} y_k - \frac{1}{n}\sum_{k=1}^{n} y'_k. \tag{C.14}$$

Using Eq. (C.10), we thus have

$$\frac{1}{n}\sum_{k=1}^{n} y'_k \ge y'_0, \tag{C.15}$$

which completes the proof. Eqs. (3.59) and (3.90) follow as straightforward consequences of Theorem A.3, as shown below.

Corollary A.4: Given

$$\sum_k \frac{1}{2^{nR_c}}\, g(\eta\beta_k N_k) = g(\eta\beta N), \tag{C.16}$$

and $\eta \ge 1/2$, we have that

$$\sum_k \frac{1}{2^{nR_c}}\, g((1 - \eta)\beta_k N_k) \ge g((1 - \eta)\beta N). \tag{C.17}$$
Proof: Substitute $x_k = \eta\beta_k N_k$, $x_0 = \eta\beta N$, $1/n = 1/2^{nR_c}$ and $\kappa = (1 - \eta)/\eta$. As $\eta \ge 1/2$, it follows that $0 \le \kappa \le 1$. Using these substitutions, Eq. (C.17) follows from Theorem A.3, with $C = 0$.

Corollary A.5: Given

$$\sum_k \frac{1}{2^{nR_c}}\, g(\eta\beta_k N + (1 - \eta)N) = g(\eta\beta N + (1 - \eta)N), \tag{C.18}$$

and $\eta \ge 1/2$, we have that

$$\sum_k \frac{1}{2^{nR_c}}\, g((1 - \eta)\beta_k N + \eta N) \ge g((1 - \eta)\beta N + \eta N). \tag{C.19}$$

Proof: Substitute $x_k = \eta\beta_k N + (1 - \eta)N$, $x_0 = \eta\beta N + (1 - \eta)N$, $1/n = 1/2^{nR_c}$ and $\kappa = (1 - \eta)/\eta$. As $\eta \ge 1/2$, we have $0 \le \kappa \le 1$. Using these substitutions, Eq. (C.19) follows from Theorem A.3, with $C = (2\eta - 1)N/\eta \ge 0$.
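Theorem A.3 can also be probed numerically. The sketch below is an illustration under stated assumptions: `g_inv` inverts $g$ by bisection (a hypothetical helper, relying on $g$ being increasing on $x \ge 0$), and `theorem_a3_gap` returns the left side of (C.9) minus the right side, which the theorem says is non-negative.

```python
import math
import random


def g(x):
    """Bose-Einstein entropy function g(x) = (1+x) ln(1+x) - x ln x, with g(0) = 0."""
    if x == 0:
        return 0.0
    return (1 + x) * math.log(1 + x) - x * math.log(x)


def g_inv(y):
    """Invert g on x >= 0 by bisection (g is monotonically increasing there)."""
    lo, hi = 0.0, 1.0
    while g(hi) < y:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def theorem_a3_gap(xs, kappa, c):
    """Return (1/n) sum g(kappa x_k + C) - g(kappa x_0 + C), with x_0 from (C.8)."""
    n = len(xs)
    x0 = g_inv(sum(g(x) for x in xs) / n)
    lhs = sum(g(kappa * x + c) for x in xs) / n
    return lhs - g(kappa * x0 + c)


def check_theorem_a3(trials=500, seed=0):
    """Return True if the gap is non-negative (up to rounding) on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.uniform(0.0, 20.0) for _ in range(rng.randint(1, 8))]
        kappa = rng.uniform(0.0, 1.0)
        c = rng.uniform(0.0, 5.0)
        if theorem_a3_gap(xs, kappa, c) < -1e-9:
            return False
    return True
```

The gap is zero when all $x_k$ coincide, and also when $\kappa = 1$ and $C = 0$, which matches the cases where the concavity argument in the proof is tight.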
Appendix D


Proofs of Weak Minimum Output Entropy Conjectures 2 and 3 for the Wehrl Entropy Measure


This appendix contains the proofs of the Wehrl-entropy versions of the weak conjectures 2 and 3 that do not draw upon the Entropy Power Inequality (EPI). As we pointed out in chapter 4, the EPI quickly leads to the Wehrl-entropy proofs for the strong forms of all the minimum output entropy conjectures. We still include the following proofs in the thesis for the sake of completeness, and because these proofs could be of mathematical interest in their own right.

Wehrl entropy is the Shannon differential entropy of the Husimi probability function $Q(\mu)$ for the state $\rho$ [64], i.e., for a single mode we have

$$W(\rho) = -\int Q(\mu) \ln[\pi Q(\mu)]\, d^2\mu, \tag{D.1}$$

where $Q(\mu) = \langle\mu|\rho|\mu\rangle/\pi$ with $|\mu\rangle$ a coherent state. The Wehrl entropy provides a measurement of the state $\rho$ in phase space and its minimum value is achieved on coherent states [64].



D.1 Weak conjecture 2


The following single-mode version of conjecture 2 was stated in chapter 4:

Weak Conjecture 2: Let a lossless beam splitter have input $a$ in its vacuum state, input $b$ in a zero-mean state with von Neumann entropy $S(\rho^B) = g(K)$, and output $c$ from its transmissivity-$\eta$ port. Then the von Neumann entropy of output $c$ is minimized when input $b$ is in a thermal state with average photon number $K$, and the minimum output entropy is given by $S(\rho^C) = g((1 - \eta)K)$.

The following is an analogous statement of the conjecture for the Wehrl entropy:

Weak Conjecture 2 (Wehrl): Let a lossless beam splitter have input $a$ in its vacuum state, input $b$ in a zero-mean state with Wehrl entropy $W(\rho^B) = 1 + \ln(K + 1)$, and output $c$ from its transmissivity-$\eta$ port. Then the Wehrl entropy of output $c$ is minimized when input $b$ is in a thermal state with average photon number $K$, and the minimum output entropy is given by $W(\rho^C) = 1 + \ln(K(1 - \eta) + 1)$.

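Proof: Before we begin the proof of the Wehrl-entropy conjecture, let us recall a few definitions. The antinormally ordered characteristic function $\chi_A^{\rho}(\zeta)$ of a state $\rho$ is given by:

$$\chi_A^{\rho}(\zeta) = \mathrm{tr}\left(\rho\, e^{-\zeta^* \hat{a}} e^{\zeta \hat{a}^\dagger}\right). \tag{D.2}$$

Also, the antinormally ordered characteristic function $\chi_A^{\rho}(\zeta)$ and the Husimi function $Q_\rho(\mu) = \langle\mu|\rho|\mu\rangle/\pi$ of a state $\rho$ form a 2-D Fourier-transform inverse-transform pair:

$$\chi_A^{\rho}(\zeta) = \int Q_\rho(\mu)\, e^{\zeta\mu^* - \zeta^*\mu}\, d^2\mu, \tag{D.3}$$

$$Q_\rho(\mu) = \frac{1}{\pi^2}\int \chi_A^{\rho}(\zeta)\, e^{\zeta^*\mu - \zeta\mu^*}\, d^2\zeta. \tag{D.4}$$

As the two input states to the beam splitter are in a product state, Eq. D.2 implies that the output state characteristic function is a product of the input state characteristic functions with scaled arguments:

$$\chi_A^{\rho^C}(\zeta) = \chi_A^{\rho^A}(\sqrt{\eta}\,\zeta)\, \chi_A^{\rho^B}(\sqrt{1 - \eta}\,\zeta). \tag{D.5}$$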

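The input $a$ is given to be in the vacuum state. Thus, the Husimi function and the Wehrl entropy of the input $a$ are given by:

$$Q_{\rho^A}(\mu) = \frac{e^{-|\mu|^2}}{\pi}, \tag{D.6}$$

$$W(\rho^A) = 1. \tag{D.7}$$

Equation D.5 and the multiplication-convolution property of Fourier transforms (FT) give us

$$Q_{\rho^C}(\mu) = \frac{1}{\eta} Q_{\rho^A}\left(\frac{\mu}{\sqrt{\eta}}\right) * \frac{1}{1 - \eta} Q_{\rho^B}\left(\frac{\mu}{\sqrt{1 - \eta}}\right) = \frac{1}{\pi\eta} e^{-|\mu|^2/\eta} * \frac{1}{1 - \eta} Q_{\rho^B}\left(\frac{\mu}{\sqrt{1 - \eta}}\right), \tag{D.8}$$

where we used the scaling property of FT: $\chi_A^{\rho}(\sqrt{\eta}\,\zeta) \longleftrightarrow (1/\eta)\, Q_\rho(\mu/\sqrt{\eta})$.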

If the state of the input b is a thermal state with mean photon number K, i.e.,

$$\rho_B = \frac{1}{\pi K} \int e^{-|\alpha|^2/K}\, |\alpha\rangle\langle\alpha|\, d^2\alpha,$$

we find that

$$W(\rho_B) = 1 + \ln(K + 1), \tag{D.9}$$

which satisfies the hypothesis of our Wehrl-entropy conjecture. Using Eq. D.8, we
can then write out the Husimi function of the output state c:

$$Q_{\rho_C}(\mu) = \frac{1}{\pi(1 + (1-\eta)K)}\, e^{-|\mu|^2/(1 + (1-\eta)K)}, \tag{D.10}$$

obtaining

$$W(\rho_C) = 1 + \ln(K(1-\eta) + 1), \tag{D.11}$$

for the resulting Wehrl entropy, which provides us with an upper bound to the
minimum output Wehrl entropy:

$$W(\rho_C) \le 1 + \ln(K(1-\eta) + 1). \tag{D.12}$$
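The step from the convolution in Eq. D.8 to the Gaussian in Eq. D.10 can be checked numerically. The sketch below convolves the scaled vacuum and thermal Husimi functions at a single point and compares the result with Eq. D.10. The helper names, the parameter values ($\eta = 0.6$, $K = 2$), and the grid are assumptions chosen only for illustration.

```python
import math


def q_thermal(mu: complex, mean_photons: float) -> float:
    """Husimi function of a zero-mean thermal state; mean_photons = 0 gives the vacuum."""
    s = mean_photons + 1.0
    return math.exp(-abs(mu) ** 2 / s) / (math.pi * s)


def scaled(q, x: float):
    """Return the scaled density mu -> (1/x) * q(mu / sqrt(x))."""
    return lambda mu: q(mu / math.sqrt(x)) / x


def convolve_at(f, h, mu0: complex, half_width: float = 8.0, n: int = 200) -> float:
    """Midpoint-rule approximation of the 2-D convolution (f * h)(mu0)."""
    step = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * step
        for j in range(n):
            nu = complex(x, -half_width + (j + 0.5) * step)
            total += f(nu) * h(mu0 - nu)
    return total * step * step


eta, K = 0.6, 2.0                                      # illustrative values (assumption)
f = scaled(lambda mu: q_thermal(mu, 0.0), eta)         # vacuum input a, scaled by eta
h = scaled(lambda mu: q_thermal(mu, K), 1.0 - eta)     # thermal input b, scaled by 1 - eta
mu0 = 0.8 + 0.3j                                       # arbitrary evaluation point
numeric = convolve_at(f, h, mu0)
s_out = 1.0 + (1.0 - eta) * K
closed_form = math.exp(-abs(mu0) ** 2 / s_out) / (math.pi * s_out)  # Eq. D.10
print(numeric, closed_form)
```

The two printed values should agree closely, reflecting the fact that convolving two circular Gaussians adds their width parameters: $\eta + (1-\eta)(K+1) = 1 + (1-\eta)K$.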

To show that the expression in Eq. D.12 is also a lower bound for $W(\rho_C)$, we use
Theorem 6 of [67], which states that for two probability distributions, $f(\mu)$ and $h(\mu)$
on $\mathbb{C}$, we have

$$W((f \star h)(\mu)) \ge \lambda\, W(f(\mu)) + (1-\lambda)\, W(h(\mu)) - \lambda\ln\lambda - (1-\lambda)\ln(1-\lambda) \tag{D.13}$$

for all $\lambda \in [0, 1]$, where $f \star h$ is the convolution of $f$ and $h$, and where the Wehrl
entropy of a probability distribution is found from Eq. 4.2 by replacing the Husimi
function $Q_\rho(\mu)$ with the given probability distribution. Choosing

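$$f(\mu) = \frac{1}{\eta}\, Q_{\rho_A}\!\left(\frac{\mu}{\sqrt{\eta}}\right), \quad\text{and}\quad h(\mu) = \frac{1}{1-\eta}\, Q_{\rho_B}\!\left(\frac{\mu}{\sqrt{1-\eta}}\right), \tag{D.14}$$

we get

$$W(\rho_C) \ge \lambda(1 + \ln\eta) + (1-\lambda)\, W\!\left(\frac{1}{1-\eta}\, Q_{\rho_B}\!\left(\frac{\mu}{\sqrt{1-\eta}}\right)\right) - \lambda\ln\lambda - (1-\lambda)\ln(1-\lambda). \tag{D.15}$$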

It is straightforward to show that the Wehrl entropy of a scaled distribution
$(1/x)\, Q(\mu/\sqrt{x})$ is given by

$$W\!\left(\frac{1}{x}\, Q\!\left(\frac{\mu}{\sqrt{x}}\right)\right) = W(Q(\mu)) + \ln x, \tag{D.16}$$

for any real $x > 0$. From Equations D.16 and D.15, we obtain

$$\begin{aligned}
W(\rho_C) &\ge \lambda(1 + \ln\eta) + (1-\lambda)\left(W(\rho_B) + \ln(1-\eta)\right) - \lambda\ln\lambda - (1-\lambda)\ln(1-\lambda) \\
&= \lambda(1 + \ln\eta) + (1-\lambda)\left(1 + \ln(K+1) + \ln(1-\eta)\right) - \lambda\ln\lambda - (1-\lambda)\ln(1-\lambda) \\
&= 1 + \ln(K(1-\eta) + 1),
\end{aligned} \tag{D.17}$$

where the last equality uses $\lambda = \eta/(\eta + (K+1)(1-\eta)) \in [0, 1]$, $\forall\, \eta, K$. Therefore the
minimum output Wehrl entropy of c must satisfy the lower bound

$$W(\rho_C) \ge 1 + \ln(K(1-\eta) + 1). \tag{D.18}$$
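The last equality in Eq. D.17 is easy to verify numerically. The following sketch evaluates the unsimplified right-hand side of Eq. D.17 at the stated value of $\lambda$ and compares it with $1 + \ln(K(1-\eta) + 1)$ for a few parameter pairs. The function names and the sample values of $\eta$ and $K$ are assumptions made for illustration, restricted to $0 < \eta < 1$ so that all logarithms are finite.

```python
import math


def bound_vacuum(lam: float, eta: float, K: float) -> float:
    """Second line of Eq. D.17: the lower bound with W(rho_B) = 1 + ln(K + 1)."""
    return (lam * (1.0 + math.log(eta))
            + (1.0 - lam) * (1.0 + math.log(K + 1.0) + math.log(1.0 - eta))
            - lam * math.log(lam)
            - (1.0 - lam) * math.log(1.0 - lam))


def chosen_lambda(eta: float, K: float) -> float:
    """The value of lambda quoted after Eq. D.17."""
    return eta / (eta + (K + 1.0) * (1.0 - eta))


checks = []
for eta, K in [(0.1, 0.5), (0.5, 1.0), (0.9, 10.0)]:  # sample values (assumption)
    lam = chosen_lambda(eta, K)
    lhs = bound_vacuum(lam, eta, K)
    rhs = 1.0 + math.log(K * (1.0 - eta) + 1.0)       # Eq. D.18
    checks.append((lhs, rhs))
    print(f"eta={eta}, K={K}, lambda={lam:.4f}, bound={lhs:.6f}, closed form={rhs:.6f}")
```

For each pair the two columns should agree to floating-point precision, since with this $\lambda$ both $\eta/\lambda$ and $(K+1)(1-\eta)/(1-\lambda)$ equal $\eta + (K+1)(1-\eta) = 1 + K(1-\eta)$.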

The upper bound (Eq. D.12) and the lower bound (Eq. D.18) on the minimum
output Wehrl entropy coincide, and thus we have the equality:

$$W(\rho_C) = 1 + \ln(K(1-\eta) + 1), \tag{D.19}$$

which is achieved by a thermal state $\rho_B$ with mean photon number K (Eq. D.9),
thus proving the conjecture for the minimum output Wehrl entropy.
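For completeness, the scaling identity of Eq. D.16 used in the proof above follows from a change of variables. The short derivation below assumes the Wehrl entropy of a density $Q$ on $\mathbb{C}$ has the form $W(Q) = -\int Q(\mu)\ln(\pi Q(\mu))\, d^2\mu$, which is consistent with $W(\rho_A) = 1$ for the vacuum in Eq. D.7; Eq. 4.2 itself is not reproduced in this appendix, so this form is an assumption here.

```latex
\begin{align*}
W\!\left(\tfrac{1}{x}\,Q\!\left(\tfrac{\mu}{\sqrt{x}}\right)\right)
&= -\int \frac{1}{x}\,Q\!\left(\frac{\mu}{\sqrt{x}}\right)
   \ln\!\left[\frac{\pi}{x}\,Q\!\left(\frac{\mu}{\sqrt{x}}\right)\right] d^2\mu \\
&= -\int Q(\nu)\,\bigl[\ln\bigl(\pi Q(\nu)\bigr) - \ln x\bigr]\, d^2\nu
   \qquad \bigl(\mu = \sqrt{x}\,\nu,\ d^2\mu = x\, d^2\nu\bigr) \\
&= W\bigl(Q(\nu)\bigr) + \ln x \int Q(\nu)\, d^2\nu \\
&= W(Q) + \ln x .
\end{align*}
```

The final step uses only the normalization $\int Q(\nu)\, d^2\nu = 1$, so the identity holds for any probability density on $\mathbb{C}$ and any $x > 0$.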

D.2  Weak  conjecture  3 

The  following  single-mode  version  of  conjecture  3  was  stated  in  chapter  4: 

Weak Conjecture 3 - Let a lossless beam splitter have input a in a zero-mean
thermal state with mean photon number N, input b in a zero-mean state with von
Neumann entropy $S(\rho_B) = g(K)$, and output c from its transmissivity-$\eta$ port. Then
the von Neumann entropy of output c is minimized when input b is in a thermal
state with average photon number K, and the minimum output entropy is given by
$S(\rho_C) = g(\eta N + (1-\eta)K)$.

The following is an analogous statement of the conjecture for the Wehrl entropy:

Conjecture 3: Wehrl - Let a lossless beam splitter have input a in a zero-mean
thermal state with mean photon number N, input b in a zero-mean state with Wehrl
entropy $W(\rho_B) = 1 + \ln(K + 1)$, and output c from its transmissivity-$\eta$ port. Then
the Wehrl entropy of output c is minimized when input b is in a thermal state with
average photon number K, and the minimum output entropy is given by
$W(\rho_C) = 1 + \ln(\eta N + (1-\eta)K + 1)$.

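Proof - Our proof of the Wehrl-entropy conjecture for the thermal-noise $\rho_A$ parallels
what we did for the vacuum-state $\rho_A$. As before, we have that

$$\chi_A^{\rho_C}(\zeta) = \chi_A^{\rho_A}(\sqrt{\eta}\,\zeta)\, \chi_A^{\rho_B}(\sqrt{1-\eta}\,\zeta). \tag{D.20}$$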

Now, however, the input a is in a zero-mean thermal state with mean photon number
N. Thus, its Husimi function and Wehrl entropy are given by:

$$Q_{\rho_A}(\mu) = \frac{1}{\pi(N+1)}\, e^{-|\mu|^2/(N+1)} \tag{D.21}$$

$$W(\rho_A) = 1 + \ln(N + 1). \tag{D.22}$$


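From Eq. D.20 and the multiplication-convolution property of Fourier transforms
(FT), we get

$$Q_{\rho_C}(\mu) = \frac{1}{\eta}\, Q_{\rho_A}\!\left(\frac{\mu}{\sqrt{\eta}}\right) \star \frac{1}{1-\eta}\, Q_{\rho_B}\!\left(\frac{\mu}{\sqrt{1-\eta}}\right) = \frac{1}{\pi\eta(N+1)}\, e^{-|\mu|^2/(\eta(N+1))} \star \frac{1}{1-\eta}\, Q_{\rho_B}\!\left(\frac{\mu}{\sqrt{1-\eta}}\right). \tag{D.23}$$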

If the state of the input b is a thermal state with mean photon number K, i.e.,

$$\rho_B = \frac{1}{\pi K} \int e^{-|\alpha|^2/K}\, |\alpha\rangle\langle\alpha|\, d^2\alpha,$$

we have

$$W(\rho_B) = 1 + \ln(K + 1), \tag{D.24}$$

which satisfies the hypothesis of our thermal-noise Wehrl-entropy conjecture. Using
Eq. D.23, we can write out the Husimi function and the Wehrl entropy of the output
c:

$$Q_{\rho_C}(\mu) = \frac{1}{\pi(1 + (1-\eta)K + \eta N)}\, e^{-|\mu|^2/(1 + (1-\eta)K + \eta N)} \tag{D.25}$$

$$W(\rho_C) = 1 + \ln(\eta N + K(1-\eta) + 1), \tag{D.26}$$

which gives us the upper bound

$$W(\rho_C) \le 1 + \ln(\eta N + K(1-\eta) + 1). \tag{D.27}$$
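The closed-form entropies in Eqs. D.24 and D.26 can be checked by direct numerical integration. The sketch below assumes the Wehrl entropy of a density $Q$ is $-\int Q \ln(\pi Q)\, d^2\mu$ (consistent with Eqs. D.7 and D.22; Eq. 4.2 is not reproduced here), and integrates radially for a circular Gaussian $Q(\mu) = e^{-|\mu|^2/s}/(\pi s)$. The function name, cutoff, step count, and parameter values are illustrative assumptions.

```python
import math


def wehrl_entropy_gaussian(s: float, r_max_factor: float = 12.0, n: int = 20000) -> float:
    """Midpoint-rule value of -int Q ln(pi Q) d^2mu for Q = exp(-|mu|^2 / s) / (pi s)."""
    r_max = r_max_factor * math.sqrt(s)
    step = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * step
        q = math.exp(-r * r / s) / (math.pi * s)
        # Area element in polar coordinates is 2 * pi * r * dr.
        total += -q * math.log(math.pi * q) * 2.0 * math.pi * r
    return total * step


eta, K, N = 0.6, 2.0, 1.5                       # illustrative values (assumption)
w_in = wehrl_entropy_gaussian(K + 1.0)          # thermal input b, compare with Eq. D.24
w_out = wehrl_entropy_gaussian(eta * N + (1.0 - eta) * K + 1.0)  # output c, Eq. D.26
print(w_in, 1.0 + math.log(K + 1.0))
print(w_out, 1.0 + math.log(eta * N + K * (1.0 - eta) + 1.0))
```

Each printed pair should agree closely; analytically, the radial integral reduces to $\int_0^\infty e^{-u}(u + \ln s)\, du = 1 + \ln s$.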

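To show that the expression in Eq. D.27 is also a lower bound for $W(\rho_C)$, we use
Eq. D.13 and the definitions in Eq. D.14 to obtain:

$$W(\rho_C) \ge \lambda\left(1 + \ln(\eta(N+1))\right) + (1-\lambda)\, W\!\left(\frac{1}{1-\eta}\, Q_{\rho_B}\!\left(\frac{\mu}{\sqrt{1-\eta}}\right)\right) - \lambda\ln\lambda - (1-\lambda)\ln(1-\lambda). \tag{D.28}$$

From Equations D.16 and D.28, we find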


$$\begin{aligned}
W(\rho_C) &\ge \lambda\left(1 + \ln(\eta(N+1))\right) + (1-\lambda)\left(W(\rho_B) + \ln(1-\eta)\right) - \lambda\ln\lambda - (1-\lambda)\ln(1-\lambda) \\
&= \lambda\left(1 + \ln(\eta(N+1))\right) + (1-\lambda)\left(1 + \ln(K+1) + \ln(1-\eta)\right) - \lambda\ln\lambda - (1-\lambda)\ln(1-\lambda) \\
&= 1 + \lambda\ln\!\left(\frac{\eta(N+1)}{\lambda}\right) + (1-\lambda)\ln\!\left(\frac{(K+1)(1-\eta)}{1-\lambda}\right) \\
&= 1 + \ln(\eta N + K(1-\eta) + 1),
\end{aligned} \tag{D.29}$$

where the last equality used $\lambda = \eta(N+1)/(\eta(N+1) + (K+1)(1-\eta)) \in [0, 1]$, $\forall\, \eta, K, N$.
Therefore the minimum output Wehrl entropy of c must satisfy the lower bound

$$W(\rho_C) \ge 1 + \ln(\eta N + K(1-\eta) + 1). \tag{D.30}$$
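The value of $\lambda$ quoted after Eq. D.29 is not arbitrary: setting the $\lambda$-derivative of the bound to zero shows that it is the value that makes the bound as large as possible. The sketch below illustrates this with a coarse grid search over $\lambda \in (0, 1)$. The function name, the grid resolution, and the parameter values are assumptions chosen for illustration only.

```python
import math


def wehrl_lower_bound(lam: float, eta: float, K: float, N: float) -> float:
    """Second line of Eq. D.29, evaluated for a given lambda in (0, 1)."""
    return (lam * (1.0 + math.log(eta * (N + 1.0)))
            + (1.0 - lam) * (1.0 + math.log(K + 1.0) + math.log(1.0 - eta))
            - lam * math.log(lam)
            - (1.0 - lam) * math.log(1.0 - lam))


eta, K, N = 0.6, 2.0, 1.5  # illustrative values (assumption)

# Lambda quoted in the text after Eq. D.29.
lam_star = eta * (N + 1.0) / (eta * (N + 1.0) + (K + 1.0) * (1.0 - eta))

# Coarse grid search for the lambda that maximizes the bound.
grid = [i / 1000.0 for i in range(1, 1000)]
lam_best = max(grid, key=lambda lam: wehrl_lower_bound(lam, eta, K, N))

closed_form = 1.0 + math.log(eta * N + K * (1.0 - eta) + 1.0)  # Eq. D.30
print(lam_star, lam_best)
print(wehrl_lower_bound(lam_star, eta, K, N), closed_form)
```

Setting N = 0 in this sketch recovers the vacuum-input bound of Eq. D.18, since $\eta(N+1)$ then reduces to $\eta$.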

The upper bound (Eq. D.27) and the lower bound (Eq. D.30) on the minimum
output Wehrl entropy coincide, and thus we have the equality:

$$W(\rho_C) = 1 + \ln(\eta N + K(1-\eta) + 1), \tag{D.31}$$

which is achieved by a thermal state $\rho_B$ with mean photon number K (Equation D.24),
thereby proving the thermal-noise Wehrl-entropy conjecture.